split_line

Differences

This shows the differences between two versions of the page.

split_line [23.02.2006 12:53 (18 years ago)] – (old version restored) 127.0.0.1
split_line [25.02.2006 13:32 (18 years ago)] cwacha
Line 1: Line 1:
-===== Split Line - A Clean and Small String Tokenizer ===== 
-=== Overview === 
-//split_line// is a clean STL string tokenizer written in C++ in fewer than 100 lines of code. In its simplest form it fills a vector of strings with the tokens of a line of text, split at space, tab, carriage return and newline. In its most complex form it supports user-provided delimiters, a user-provided quote character, a user-provided escape character, a special character for comments, and a limited ability to resume tokenization with another part of the string. 
- 
-=== Features === 
-  * splits a line of text into words delimited by one or more delimiters 
-  * user can provide delimiters (defaults to \t\r\n and space) 
-  * user can provide one special character for quoted text (defaults to ") 
-  * user can provide one special escape character (defaults to \) 
-  * user can provide one special character for comments (disabled by default) 
-  * limited support for resuming tokenization at another part of the string 
- 
-=== Download === 
-  * {{projects:split_line-1.0.zip}} 
- 
-=== Code Example === 
- 
-<code cpp> 
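-#include <iostream> 
-#include <string> 
-#include <vector> 
-// plus the header that declares split_line (from the download above) 
-using namespace std; 
- 
-int main(int argc, char *argv[]) { 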
-    vector<string> tokens; 
-    string line = "Writing    programs     \"in C++\"  is   \ 
-     Fun!!"; 
- 
-    split_line(tokens, line); 
- 
-    cout << "Tokens:" << endl; 
-    for(unsigned int i = 0; i < tokens.size(); i++) 
-        cout << "'" << tokens[i] << "'" << endl; 
-  
-    return 0; 
-} 
-</code> 
- 
-Output: 
-<code> 
-Tokens: 
-'Writing' 
-'programs' 
-'in C++' 
-'is' 
-'Fun!!' 
-</code> 
-=== Documentation === 
- 
-A more complex example can be found in the readFile() function of [[configparser]]. The split_line function resembles a state machine with 5 states (see enum SPLIT_LINE_STATE). You can provide the starting state of the machine, which lets you resume tokenization of a string in some cases. In resuming mode (start_state != SL_NORMAL), the characters read are appended to the last string in the string vector //ret// until the state switches back to SL_NORMAL. [[configparser]] uses this behaviour to read multiline values. However, this feature does not let you split a string at an arbitrary position yourself and then pass the remainder to split_line (using the returned state as the new start_state). In most cases the outcome will differ from what you might expect! 
- 
-<code cpp> 
-enum { 
- SL_NORMAL, 
- SL_ESCAPE, 
- SL_SAFEMODE, 
- SL_SAFEESCAPE, 
- SL_COMMENT, 
-} SPLIT_LINE_STATE; 
- 
-// Splits line into tokens and stores them in ret. Supports delimiters and escape characters. 
-// Special characters are ignored between a pair of safemode_char and between comment_char and the line end '\n'. 
-// Returns the SPLIT_LINE_STATE the parser was in when it returned. 
-int split_line(std::vector<std::string>& ret, std::string& line, const std::string& delimiters = " \t\r\n", char escape_char = '\\', char safemode_char = '"', char comment_char = '\0', int start_state = SL_NORMAL); 
-</code> 
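
To make the state machine described above more concrete, here is a minimal sketch of a tokenizer with the same five states. It is not the code from the download: the names ''SketchState'' and ''sketch_split_line'' are hypothetical, the transitions are inferred from the description and the legend (eat, finish), and resuming via a start state is deliberately left out. It also assumes that a comment character ends the current token.

```cpp
#include <string>
#include <vector>

// Hypothetical names for illustration only; not the library's identifiers.
enum SketchState { SK_NORMAL, SK_ESCAPE, SK_SAFEMODE, SK_SAFEESCAPE, SK_COMMENT };

// Splits line into tokens appended to ret and returns the state at the end of
// the input. Assumption: no resume support (always starts in SK_NORMAL).
int sketch_split_line(std::vector<std::string>& ret, const std::string& line,
                      const std::string& delimiters = " \t\r\n",
                      char escape_char = '\\', char safemode_char = '"',
                      char comment_char = '\0') {
    int state = SK_NORMAL;
    std::string token;
    bool in_token = false;  // true once a token has started, so "" yields an empty token

    for (std::string::size_type i = 0; i < line.size(); ++i) {
        const char c = line[i];
        switch (state) {
        case SK_NORMAL:
            if (c == escape_char) {
                state = SK_ESCAPE;
                in_token = true;
            } else if (c == safemode_char) {
                state = SK_SAFEMODE;
                in_token = true;
            } else if (comment_char != '\0' && c == comment_char) {
                // Assumption: a comment finishes the token in progress.
                if (in_token) { ret.push_back(token); token.clear(); in_token = false; }
                state = SK_COMMENT;
            } else if (delimiters.find(c) != std::string::npos) {
                // finish: store the token and start a new one
                if (in_token) { ret.push_back(token); token.clear(); in_token = false; }
            } else {
                token += c;  // eat
                in_token = true;
            }
            break;
        case SK_ESCAPE:  // the escaped character is taken literally
            token += c;
            state = SK_NORMAL;
            break;
        case SK_SAFEMODE:  // inside quotes, delimiters are ordinary characters
            if (c == escape_char) state = SK_SAFEESCAPE;
            else if (c == safemode_char) state = SK_NORMAL;
            else token += c;
            break;
        case SK_SAFEESCAPE:  // escaped character inside quotes
            token += c;
            state = SK_SAFEMODE;
            break;
        case SK_COMMENT:  // skip everything up to the end of the line
            if (c == '\n') state = SK_NORMAL;
            break;
        }
    }
    if (in_token) ret.push_back(token);  // flush the last token
    return state;
}
```

In this sketch, a return value other than ''SK_NORMAL'' means the input ended inside a quote, an escape sequence or a comment. That is the kind of information a caller would need in order to continue with a following line.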
- 
-== State Diagram == 
- 
-{{ projects:splitline.png }} 
- 
-**Legend** 
- 
-  * character read in / action 
-  * eat: append the character to the current token 
-  * finish: append token to token list and start with a new token 
- 
- 
-=== License === 
- 
-<html> 
- 
-<!-- Creative Commons License --> 
-<a href="http://creativecommons.org/licenses/GPL/2.0/"> 
-<img alt="CC-GNU GPL" border="0" src="http://creativecommons.org/images/public/cc-GPL-a.png" /></a><br /> 
-This software is licensed under the <a href="http://creativecommons.org/licenses/GPL/2.0/">CC-GNU GPL</a>. 
-<!-- /Creative Commons License --> 
- 
-<!-- 
- 
-<rdf:RDF xmlns="http://web.resource.org/cc/" 
-    xmlns:dc="http://purl.org/dc/elements/1.1/" 
-    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> 
-<Work rdf:about=""> 
-   <license rdf:resource="http://creativecommons.org/licenses/GPL/2.0/" /> 
-   <dc:type rdf:resource="http://purl.org/dc/dcmitype/Software" /> 
-</Work> 
- 
-<License rdf:about="http://creativecommons.org/licenses/GPL/2.0/"> 
-<permits rdf:resource="http://web.resource.org/cc/Reproduction" /> 
-   <permits rdf:resource="http://web.resource.org/cc/Distribution" /> 
-   <requires rdf:resource="http://web.resource.org/cc/Notice" /> 
-   <permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" /> 
-   <requires rdf:resource="http://web.resource.org/cc/ShareAlike" /> 
-   <requires rdf:resource="http://web.resource.org/cc/SourceCode" /> 
-</License> 
- 
-</rdf:RDF> 
- 
---> 
- 
-</html> 
- 
  
  • split_line.txt
  • Last modified: 16.11.2016 23:18 (8 years ago)
  • by 127.0.0.1