Differences
This page shows the differences between two versions.
split_line [06.01.2006 16:06 (19 years ago)] – cwacha | split_line [25.02.2006 13:32 (18 years ago)] – cwacha | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ===== Split Line - A Clean and Small String Tokenizer ===== | ||
- | === Overview === | ||
- | // | ||
- | |||
- | === Features === | ||
- | * splits a line of text into words delimited by one or more delimiters | ||
- | * user can provide delimiters (defaults to \t\r\n and space) | ||
- | * user can provide one special character for quoted text (defaults to ") | ||
- | * user can provide one special escape character (defaults to \) | ||
- | * user can provide one special character for comments (disabled by default) | ||
- | * limited support for resuming tokenization at another part of the string | ||
- | |||
- | === Download === | ||
- | * {{projects: | ||
- | |||
- | === Code Example === | ||
- | |||
- | <code cpp> | ||
- | int main(int argc, char *argv[]) { | ||
- | vector< | ||
- | string line = " | ||
- | | ||
- | |||
- | split_line(tokens, | ||
- | |||
- | cout << " | ||
- | for(unsigned int i = 0; i < tokens.size(); | ||
- | cout << "'" | ||
- | |||
- | return 0; | ||
- | } | ||
- | </ | ||
- | |||
- | Output: | ||
- | < | ||
- | Tokens: | ||
- | ' | ||
- | ' | ||
- | 'in C++' | ||
- | ' | ||
- | ' | ||
- | </ | ||
- | === Documentation === | ||
- | |||
- | A more complex example can be found in the function readFile() of [[configparser]]. split_line works like a state machine with five states (see enum SPLIT_LINE_STATE). You can pass in the starting state of the machine, which in some cases lets you resume tokenization of a string. In resuming mode (start_state != SL_NORMAL) the characters read are appended to the last string in the string vector //ret// until the state switches back to SL_NORMAL. [[configparser]] uses this behaviour to read multiline values. However, this feature does not let you split a string at an arbitrary position yourself and then hand the pieces to split_line one after another (using the returned state as the new start_state). In most cases the outcome will differ from what you expect! | ||
- | |||
- | <code cpp> | ||
- | enum SPLIT_LINE_STATE { | ||
- | SL_NORMAL, | ||
- | SL_ESCAPE, | ||
- | SL_SAFEMODE, | ||
- | SL_SAFEESCAPE, | ||
- | SL_COMMENT, | ||
- | }; | ||
- | |||
- | // splits line into tokens and stores them in ret. Supports delimiters, escape characters, | ||
- | // ignores special characters between safemode_char and between comment_char and line end ' | ||
- | // returns SPLIT_LINE_STATE the parser was in when returning | ||
- | int split_line(std:: | ||
- | </ | ||
- | |||
- | |||
- | === License === | ||
- | |||
- | < | ||
- | |||
- | <!-- Creative Commons License --> | ||
- | <a href=" | ||
- | <img alt=" | ||
- | / | ||
- | This software is licensed under the <a href=" | ||
- | <!-- /Creative Commons License --> | ||
- | |||
- | </ | ||
- | |||
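The state machine described in the Documentation section can be sketched as follows. This is a minimal illustration, not the original split_line source: the function name `split_line_sketch`, its parameter list, the enum name `SketchState`, and every state transition are assumptions. Only the five state names are taken from the enum shown in the page, and the resume behaviour (appending to the last string in `ret`) follows the description in the text.

```cpp
#include <string>
#include <vector>

// State names follow the page's enum; the type name is hypothetical.
enum SketchState { SL_NORMAL, SL_ESCAPE, SL_SAFEMODE, SL_SAFEESCAPE, SL_COMMENT };

// Hypothetical signature: the original split_line() declaration is truncated
// in the page, so all parameters and defaults here are assumptions.
int split_line_sketch(std::vector<std::string>& ret,
                      const std::string& line,
                      const std::string& delimiters = " \t\r\n",
                      char safemode_char = '"',
                      char escape_char = '\\',
                      char comment_char = '\0',   // '\0' means comments are disabled
                      int start_state = SL_NORMAL)
{
    int state = start_state;
    std::string token;
    bool in_token = false;  // true once the current token has been started

    // Resuming mode: keep appending to the last string already stored in ret.
    if (state != SL_NORMAL && state != SL_COMMENT && !ret.empty()) {
        token = ret.back();
        ret.pop_back();
        in_token = true;
    }

    for (char c : line) {
        switch (state) {
        case SL_NORMAL:
            if (c == escape_char) {
                state = SL_ESCAPE;
                in_token = true;
            } else if (c == safemode_char) {
                state = SL_SAFEMODE;
                in_token = true;   // so that "" yields an empty token
            } else if (comment_char != '\0' && c == comment_char) {
                // A comment ends the current token, if there is one.
                if (in_token) { ret.push_back(token); token.clear(); in_token = false; }
                state = SL_COMMENT;
            } else if (delimiters.find(c) != std::string::npos) {
                // Runs of delimiters collapse: only a started token is stored.
                if (in_token) { ret.push_back(token); token.clear(); in_token = false; }
            } else {
                token += c;
                in_token = true;
            }
            break;
        case SL_ESCAPE:            // take the next character literally
            token += c;
            state = SL_NORMAL;
            break;
        case SL_SAFEMODE:          // inside quotes: delimiters are ordinary text
            if (c == escape_char)        state = SL_SAFEESCAPE;
            else if (c == safemode_char) state = SL_NORMAL;
            else                         token += c;
            break;
        case SL_SAFEESCAPE:        // escaped character inside quotes
            token += c;
            state = SL_SAFEMODE;
            break;
        case SL_COMMENT:           // skip everything up to the end of the line
            if (c == '\n') state = SL_NORMAL;
            break;
        }
    }

    if (in_token) ret.push_back(token);
    return state;  // the caller may pass this back in as start_state
}
```

With the default arguments, the input `Hello "in C++" world\ !` yields the three tokens `Hello`, `in C++` and `world !` in this sketch. Calling it first with `key "multi` returns `SL_SAFEMODE`; passing that state back in together with `line" end` extends the last token to `multiline` and then adds `end`, which mirrors the multiline use case mentioned for configparser.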