split_line

split_line [25.02.2006 13:32 (18 years ago)] cwacha
Line 1:
-===== Split Line - A Clean and Small String Tokenizer ===== 
-=== Overview === 
-//split_line// is a clean STL string tokenizer written in C++ in fewer than 100 lines of code. In its simplest form it splits a line of text at spaces, tabs, carriage returns and newlines and stores the tokens in a vector of strings. In its most complex form it supports user-provided delimiters, a user-provided quote character, a user-provided escape character, a special character for comments, and a limited ability to resume tokenization with another part of the string. 
- 
-=== Features === 
-  * splits a line of text into words delimited by one or more delimiters 
-  * user can provide delimiters (defaults to \t\r\n and space) 
-  * user can provide one special character for quoted text (defaults to ") 
-  * user can provide one special escape character (defaults to \) 
-  * user can provide one special character for comments (disabled by default) 
-  * limited support to resume at another part of the string 
- 
-=== Download === 
-  * {{projects:split_line-1.0.zip}} 
- 
-=== Code Example === 
- 
-<code cpp> 
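-#include <iostream> 
-#include <string> 
-#include <vector> 
-// split_line() itself comes from the download above 
- 
-using namespace std; 
- 
-int main() { 
-    vector<string> tokens; 
-    // adjacent string literals are concatenated by the compiler 
-    string line = "Writing    programs     \"in C++\"  is   " 
-                  "     Fun!!"; 
- 
-    split_line(tokens, line); 
- 
-    // print every token in single quotes, one per line 
-    cout << "Tokens:" << endl; 
-    for(size_t i = 0; i < tokens.size(); i++) 
-        cout << "'" << tokens[i] << "'" << endl; 
- 
-    return 0; 
-} 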
-</code> 
- 
-Output: 
-<code> 
-Tokens: 
-'Writing' 
-'programs' 
-'in C++' 
-'is' 
-'Fun!!' 
-</code> 
-=== Documentation === 
- 
-A more complex example can be found in [[cfg_parser]], in the function readFile(). split_line itself works like a state machine with 5 states (see enum SPLIT_LINE_STATE). You can pass in the starting state of the machine, which lets you resume tokenization of a string in some cases. In resuming mode (start_state != SL_NORMAL), the characters read are appended to the last string in the string vector //ret// until the state switches back to SL_NORMAL. In [[configparser]] this behaviour was used to read multiline values. However, this feature does not let you split a string at an arbitrary position yourself and then hand the remainder to split_line (using the returned state as the new start_state): in most cases the result will differ from what you might expect! 
- 
-<code cpp> 
-enum SPLIT_LINE_STATE { 
- SL_NORMAL, 
- SL_ESCAPE, 
- SL_SAFEMODE, 
- SL_SAFEESCAPE, 
- SL_COMMENT 
-}; 
- 
-// Splits line into tokens and stores them in ret. Supports delimiters and escape characters; 
-// special characters are ignored between a pair of safemode_char and between comment_char and the line end '\n'. 
-// Returns the SPLIT_LINE_STATE the parser was in when it returned. 
-int split_line(std::vector<std::string>& ret, std::string& line, const std::string& delimiters = " \t\r\n", char escape_char = '\\', char safemode_char = '"', char comment_char = '\0', int start_state = SL_NORMAL); 
-</code> 
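
To make the five states more concrete, here is a minimal sketch of a tokenizer built around the same state names. It is not the original split_line implementation: the names `split_line_sketch`, `SketchState`, `SK_*` and `finish_token` are hypothetical, the transitions are assumptions inferred from the description above, and the start_state parameter (resuming) is left out on purpose.

```cpp
#include <string>
#include <vector>

// Hypothetical states mirroring the five SPLIT_LINE_STATE values (assumption).
enum SketchState { SK_NORMAL, SK_ESCAPE, SK_SAFEMODE, SK_SAFEESCAPE, SK_COMMENT };

// "finish": append the current token (if one was started) and begin a new one.
static void finish_token(std::vector<std::string>& ret, std::string& token, bool& in_token) {
    if (in_token) ret.push_back(token);
    token.clear();
    in_token = false;
}

// Illustrative only; resuming via start_state is deliberately not modelled.
int split_line_sketch(std::vector<std::string>& ret, const std::string& line,
                      const std::string& delimiters = " \t\r\n",
                      char escape_char = '\\', char safemode_char = '"',
                      char comment_char = '\0') {
    int state = SK_NORMAL;
    std::string token;
    bool in_token = false;  // true once a token has begun, so "" yields an empty token

    for (std::string::size_type i = 0; i < line.size(); ++i) {
        const char c = line[i];
        switch (state) {
        case SK_NORMAL:
            if (c == escape_char) {
                state = SK_ESCAPE;            // next character is taken literally
                in_token = true;
            } else if (c == safemode_char) {
                state = SK_SAFEMODE;          // opening quote
                in_token = true;
            } else if (comment_char != '\0' && c == comment_char) {
                finish_token(ret, token, in_token);
                state = SK_COMMENT;           // skip until end of line
            } else if (delimiters.find(c) != std::string::npos) {
                finish_token(ret, token, in_token);
            } else {
                token += c;                   // "eat"
                in_token = true;
            }
            break;
        case SK_ESCAPE:
            token += c;
            state = SK_NORMAL;
            break;
        case SK_SAFEMODE:
            if (c == escape_char) state = SK_SAFEESCAPE;
            else if (c == safemode_char) state = SK_NORMAL;  // closing quote
            else token += c;                  // delimiters are kept inside quotes
            break;
        case SK_SAFEESCAPE:
            token += c;
            state = SK_SAFEMODE;
            break;
        case SK_COMMENT:
            if (c == '\n') state = SK_NORMAL; // comment ends at the newline
            break;
        }
    }
    finish_token(ret, token, in_token);       // flush a trailing token
    return state;                             // state the scanner stopped in
}
```

Under these assumptions, calling `split_line_sketch(tokens, line)` on the string from the code example above should produce the same five tokens as listed in its output. Passing `'#'` as comment_char would drop everything from `#` up to the next newline.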
- 
-== State Diagram == 
- 
-{{ projects:splitline.png }} 
- 
-**Legend** 
- 
-  * edge label format: character read / action 
-  * eat: append the character to the current token 
-  * finish: append token to token list and start with a new token 
- 
- 
-=== License === 
- 
-<html> 
- 
-<!-- Creative Commons License --> 
-<a href="http://creativecommons.org/licenses/GPL/2.0/"> 
-<img alt="CC-GNU GPL" border="0" src="http://creativecommons.org/images/public/cc-GPL-a.png" /></a><br /> 
-This software is licensed under the <a href="http://creativecommons.org/licenses/GPL/2.0/">CC-GNU GPL</a>. 
-<!-- /Creative Commons License --> 
- 
-<!-- 
- 
-<rdf:RDF xmlns="http://web.resource.org/cc/" 
-    xmlns:dc="http://purl.org/dc/elements/1.1/" 
-    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> 
-<Work rdf:about=""> 
-   <license rdf:resource="http://creativecommons.org/licenses/GPL/2.0/" /> 
-   <dc:type rdf:resource="http://purl.org/dc/dcmitype/Software" /> 
-</Work> 
- 
-<License rdf:about="http://creativecommons.org/licenses/GPL/2.0/"> 
-<permits rdf:resource="http://web.resource.org/cc/Reproduction" /> 
-   <permits rdf:resource="http://web.resource.org/cc/Distribution" /> 
-   <requires rdf:resource="http://web.resource.org/cc/Notice" /> 
-   <permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" /> 
-   <requires rdf:resource="http://web.resource.org/cc/ShareAlike" /> 
-   <requires rdf:resource="http://web.resource.org/cc/SourceCode" /> 
-</License> 
- 
-</rdf:RDF> 
- 
---> 
- 
-</html> 
- 
  
  • split_line.txt
  • Last modified: 16.11.2016 23:18 (8 years ago)
  • by 127.0.0.1