Which parser (lexer) does DokuWiki use?
og #1
Member since May 2006 · 436 posts · Location: Bayern
Subject: Which parser (lexer) does DokuWiki use?
What is the internal parser made of? Is it a custom library/class or another open-source library?
Is the parser usable in one's own plugins, e.g. by creating an instance and passing it terminals and tokens?
Oli...
turnermm (Moderator) #2
Member since Oct 2009 · 4781 posts · Location: Canada
There is a description of how DokuWiki parses files: https://www.dokuwiki.org/devel:parser. While I understand the basic processes, I've never had the patience to master the details. But if you simply want to change the output of the parser, i.e. the HTML that it produces, you can create a renderer plugin: https://www.dokuwiki.org/devel:renderer_plugins. In effect, a renderer plugin extends the renderer type it replaces, xhtml or metadata, so you only have to implement the class methods whose behavior you want to change. Check parser/xhtml.php for all of the possible methods (and their parameters) and then look at some of the renderer plugins to see what others have done.
Myron Turner
github: https://github.com/turnermm
plugins, templates: http://www.mturner.org/devel
og #3
Member since May 2006 · 436 posts · Location: Bayern
Every plugin developer with a slightly more complex syntax has to master the same challenge: building a parser for their own syntax. Maybe it's of no use, but I would like to see a handout that explains to developers how to do this. Maybe it could also provide some functions or classes for it.
What do you think?

To start with a trivial case: my syntax looks like '<database2 OPTION...>', where I have to parse "OPTION...", a whitespace-delimited list of arbitrary plugin options. Each option can be given as a bare keyword or as a key/value pair with single or double quotes, with optional whitespace between the key, the equals sign and the value:
  keyword
  key="value"
  key = "value"
  key='value'
  key = 'value'
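
One way to handle the five forms listed above is a single regular expression applied repeatedly from left to right. The following is only a sketch in Python, not code from any DokuWiki plugin; the names `OPTION_RE` and `parse_options` are made up for illustration, and it assumes that keys consist of word characters and hyphens and that values contain no escaped quotes. The same pattern should carry over to PHP's PCRE functions inside a plugin's handler step.

```python
import re

# Hypothetical pattern: a key, optionally followed by = and a quoted value.
OPTION_RE = re.compile(
    r"""
    (?P<key>[A-Za-z_][\w-]*)      # option name or bare keyword
    (?:
        \s*=\s*                   # equals sign with optional whitespace
        (?:"(?P<dq>[^"]*)"        # double-quoted value
          |'(?P<sq>[^']*)')       # single-quoted value
    )?
    """,
    re.VERBOSE,
)
WHITESPACE_RE = re.compile(r"\s*")


def parse_options(text):
    """Parse an option string into a dict; bare keywords map to True."""
    options = {}
    text = text.strip()
    pos = 0
    while pos < len(text):
        m = OPTION_RE.match(text, pos)
        if m is None:
            raise ValueError(f"cannot parse options at offset {pos}: {text[pos:]!r}")
        value = m.group("dq") if m.group("dq") is not None else m.group("sq")
        options[m.group("key")] = True if value is None else value
        # Skip the whitespace that separates this option from the next one.
        pos = WHITESPACE_RE.match(text, m.end()).end()
    return options
```

Anything the pattern cannot consume (an unquoted value, a stray quote) raises an error instead of being skipped silently, which tends to make user typos easier to report.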
Oli...
turnermm (Moderator) #4
Member since Oct 2009 · 4781 posts · Location: Canada
Why can't you handle this in the handler function of a syntax plugin?
og #5
Member since May 2006 · 436 posts · Location: Bayern
I do, but there I get the user's input as a string, which I still need to parse. The result should be an SQL term, which can be cached. This term gets executed inside the renderer, which is also able to cache its output. Caching renderer output is usually a bad idea with external data sources, but in my case it's okay because the data is only updated via this plugin, so I can invalidate the cache on that event.

I have written many different parsers, all from scratch, and because this is a common task I would like to make my life easier with a reusable method. I read through some of the available libraries; most of them are overkill in this context. Maybe it's not so hard to write one myself, one that is optimized for use in DokuWiki syntax plugins and can be reused by others.
Oli...
turnermm (Moderator) #6
Member since Oct 2009 · 4781 posts · Location: Canada
I'm not sure this will help, but it's just a thought.

Are you familiar with Lexer->addPattern?  See https://www.dokuwiki.org/devel:syntax_plug…?s[]=addpatte….

For an example of its use: https://github.com/turnermm/htmlOKay/blob/master/syntax.php.  Scroll down to the connect() function.

This requires a relatively structured syntax on your page, but you don't have to delimit each potential token with a separate pattern. Instead, everything between the entryPattern and the exitPattern is scanned for a match to one of the added patterns and, if one is found, it is sent to the DOKU_LEXER_MATCHED case. See the handle function in htmlOKay.
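
To illustrate the idea of scanning a delimited region for added patterns, here is a small Python sketch. It is not DokuWiki's lexer and does not use its API; the function `scan_region` and the state names "ENTER", "MATCHED", "UNMATCHED" and "EXIT" are stand-ins chosen to mirror the DOKU_LEXER_* cases a handler would see, and it assumes a single, non-nested region.

```python
import re


def scan_region(text, entry, exit_, patterns):
    """Return (state, matched_text) tuples for the first entry/exit region."""
    tokens = []
    m_entry = re.search(entry, text)
    if m_entry is None:
        return tokens
    m_exit = re.compile(exit_).search(text, m_entry.end())
    if m_exit is None:
        return tokens

    tokens.append(("ENTER", m_entry.group()))
    # Combine all added patterns into one alternation, as a stand-in for
    # registering several patterns on the same mode.
    inner = re.compile("|".join(f"(?:{p})" for p in patterns))
    pos = m_entry.end()
    for m in inner.finditer(text, pos, m_exit.start()):
        if not m.group():
            continue  # ignore empty matches
        if m.start() > pos:
            tokens.append(("UNMATCHED", text[pos:m.start()]))
        tokens.append(("MATCHED", m.group()))
        pos = m.end()
    if pos < m_exit.start():
        tokens.append(("UNMATCHED", text[pos:m_exit.start()]))
    tokens.append(("EXIT", m_exit.group()))
    return tokens
```

A call such as `scan_region(page, r"<opts>", r"</opts>", [r"\w+=\d+"])` reports each `key=number` pair inside the region as a matched token and everything else in between as unmatched text, which is roughly the division of labor described above.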
andi (Administrator) #7
User title: splitbrain
Member since May 2006 · 3520 posts · Location: Berlin Germany
I parsed a (slightly more complex) key-value syntax in a plugin here (doing it in the handler step) using a regular expression:

https://github.com/cosmocode/csv/blob/master/helper.php#L5…

Maybe that helps as an inspiration?