.TH STRING 3
.SH NAME
s_alloc, s_append, s_array, s_copy, s_error, s_free, s_incref, s_memappend, s_nappend, s_new, s_newalloc, s_parse, s_reset, s_restart, s_terminate, s_tolower, s_putc, s_unique, s_grow, s_read, s_read_line, s_getline, s_allocinstack, s_freeinstack, s_rdinstack \- extensible strings
.SH SYNOPSIS
.B #include <u.h>
.br
.B #include <libc.h>
.br
.B #include <String.h>
.PP
.ta +\w'\fLSinstack* 'u
.B
String* s_new(void)
.br
.B
void s_free(String *s)
.br
.B
String* s_newalloc(int n)
.br
.B
String* s_array(char *p, int n)
.br
.B
String* s_grow(String *s, int n)
.PP
.B
void s_putc(String *s, int c)
.br
.B
void s_terminate(String *s)
.br
.B
String* s_reset(String *s)
.br
.B
String* s_restart(String *s)
.br
.B
String* s_append(String *s, char *p)
.br
.B
String* s_nappend(String *s, char *p, int n)
.br
.B
String* s_memappend(String *s, char *p, int n)
.br
.B
String* s_copy(char *p)
.br
.B
String* s_parse(String *s1, String *s2)
.br
.PP
.B
void s_tolower(String *s)
.PP
.B
String* s_incref(String *s)
.br
.B
String* s_unique(String *s)
.PP
.B
Sinstack* s_allocinstack(char *file)
.br
.B
void s_freeinstack(Sinstack *stack)
.br
.B
char* s_rdinstack(Sinstack *stack, String *s)
.PP
.B
#include <bio.h>
.PP
.B
int s_read(Biobuf *b, String *s, int n)
.br
.B
char* s_read_line(Biobuf *b, String *s)
.br
.B
char* s_getline(Biobuf *b, String *s)
.SH DESCRIPTION
.PP
These routines manipulate extensible strings.
The basic type is
.BR String ,
which points to an array of characters. The string
maintains pointers to the beginning and end of the allocated
array. In addition, a finger pointer keeps track of where
parsing will start (for
.IR s_parse )
or new characters will be added (for
.IR s_putc ,
.IR s_append ,
and
.IR s_nappend ).
The structure and a few useful macros are:
.sp
.EX
typedef struct String {
	Lock;
	char	*base;	/* base of String */
	char	*end;	/* end of allocated space+1 */
	char	*ptr;	/* ptr into String */
	...
} String;
#define s_to_c(s) ((s)->base)
#define s_len(s) ((s)->ptr-(s)->base)
#define s_clone(s) s_copy((s)->base)
.EE
.PP
.I S_to_c
is used when code needs a reference to the character array.
Using
.B s->base
directly is frowned upon since it exposes too much of the implementation.
.SS "Allocation and freeing
.PP
A string must be allocated before it can be used.
One normally does this using
.IR s_new ,
giving the string an initial allocation of
128 bytes.
If you know that the string will need to grow much
longer, you can use
.I s_newalloc
instead, specifying the number of bytes in the
initial allocation.
.PP
.I S_free
causes both the string and its character array to be freed.
.PP
.I S_grow
grows a string's allocation by a fixed amount. It is useful if
you are reading directly into a string's character array but should
be avoided if possible.
.PP
.I S_array
is used to create a constant array, that is, one whose contents
won't change. It points directly to the character array
given as an argument. Tread lightly when using this call.
.SS "Filling the string
After its initial allocation, the string points to the beginning
of an allocated array of characters starting with
.SM NUL.
.PP
.I S_putc
writes a character into the string at the
pointer and advances the pointer to point after it.
.PP
.I S_terminate
writes a
.SM NUL
at the pointer but doesn't advance it.
.PP
.I S_restart
resets the pointer to the beginning of the string but doesn't change the contents.
.PP
.I S_reset
is equivalent to
.I s_restart
followed by
.IR s_terminate .
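.PP
As an illustration of the pointer rules above, the following stand-alone
sketch models them on a fixed-size buffer.
It is not the library's implementation: the type
.B Model
and the functions prefixed
.B m_
are hypothetical names introduced here, and growth and locking are left out.
.sp
.EX
#include <stddef.h>

/* Hypothetical fixed-size model of the base/ptr/end layout. */
typedef struct Model {
	char buf[16];
	char *base;	/* start of the array */
	char *end;	/* one past the allocated space */
	char *ptr;	/* where the next character goes */
} Model;

void
m_init(Model *m)
{
	m->base = m->buf;
	m->end = m->buf + sizeof m->buf;
	m->ptr = m->base;
	*m->ptr = 0;	/* a fresh string starts with NUL */
}

/* Write c at the pointer and advance.  Assumption of this model only:
   characters are dropped when full, keeping one byte for a terminator. */
void
m_putc(Model *m, int c)
{
	if(m->ptr < m->end - 1)
		*m->ptr++ = c;
}

/* Write NUL at the pointer without advancing. */
void
m_terminate(Model *m)
{
	*m->ptr = 0;
}

/* Move the pointer back to the start; the contents are untouched. */
void
m_restart(Model *m)
{
	m->ptr = m->base;
}

/* Restart, then terminate: the string becomes empty. */
void
m_reset(Model *m)
{
	m_restart(m);
	m_terminate(m);
}

/* Same arithmetic as the s_len macro: ptr minus base. */
ptrdiff_t
m_len(Model *m)
{
	return m->ptr - m->base;
}
.EE
.PP
In this model, the length after a restart is zero although the old bytes
are still readable through
.BR base ;
only a reset also makes the first byte
.SM NUL.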
.PP
.I S_append
and
.I s_nappend
copy characters into the string at the pointer and
advance the pointer. They also write a
.SM NUL
at
the pointer without advancing the pointer beyond it.
Both routines stop copying on encountering a
.SM NUL.
.I S_memappend
is like
.I s_nappend
but doesn't stop at a
.SM NUL.
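.PP
The difference between the bounded copy that stops at a
.SM NUL
and the one that does not can be sketched on plain character arrays.
The function names
.B copy_until_nul
and
.B copy_bytes
are hypothetical, the caller is assumed to supply a destination of at least
.IR n +1
bytes, and nothing here is taken from the library source.
.sp
.EX
#include <stddef.h>

/* Copy at most n characters from p to dst, stopping early at a NUL
   (the rule described for s_nappend).  Returns the new write position. */
char*
copy_until_nul(char *dst, const char *p, size_t n)
{
	while(n-- > 0 && *p != 0)
		*dst++ = *p++;
	*dst = 0;	/* terminator at the pointer, not counted */
	return dst;
}

/* Copy exactly n bytes, NULs included
   (the rule described for s_memappend). */
char*
copy_bytes(char *dst, const char *p, size_t n)
{
	while(n-- > 0)
		*dst++ = *p++;
	*dst = 0;	/* terminator at the pointer, not counted */
	return dst;
}
.EE
.PP
Given five source bytes with a
.SM NUL
in the third position, the first function advances by two bytes
and the second by five.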
.PP
If you know the initial character array to be copied into a string,
you can allocate a string and copy in the bytes using
.IR s_copy .
This is equivalent to an
.I s_new
followed by an
.IR s_append .
.PP
.I S_parse
copies the next white space terminated token from
.I s1
to
the end of
.IR s2 .
White space is defined as space, tab,
and newline. Both single and double quoted strings are treated as
a single token. The bounding quotes are not copied.
There is no escape mechanism.
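.PP
One reading of these tokenizing rules is sketched below on plain C strings.
The function
.B next_token
is hypothetical and writes into a caller-supplied buffer instead of
appending to a
.BR String .
It also assumes details the text above does not settle: leading white space
is skipped before a quote is looked for, a quote in the middle of an
unquoted token is an ordinary character, and an unterminated quote runs to
the end of the input.
.sp
.EX
#include <stddef.h>

/* ASCII codes, written numerically */
enum { TAB = 9, NEWLINE = 10, SQUOTE = 39 };

static int
is_white(int c)
{
	return c == ' ' || c == TAB || c == NEWLINE;
}

/*
 * Copy the next token from *src into dst (capacity n, n > 0) and
 * advance *src past it.  Characters that do not fit are dropped.
 * Returns 0 when no token remains, 1 otherwise.
 */
int
next_token(const char **src, char *dst, size_t n)
{
	const char *p = *src;
	size_t i = 0;
	int quote;

	while(is_white(*p))
		p++;
	if(*p == 0){
		*src = p;
		return 0;
	}
	if(*p == '"' || *p == SQUOTE){
		quote = *p++;	/* the opening quote is not copied */
		while(*p != 0 && *p != quote){
			if(i < n - 1)
				dst[i++] = *p;
			p++;
		}
		if(*p == quote)
			p++;	/* nor is the closing quote */
	}else{
		while(*p != 0 && !is_white(*p)){
			if(i < n - 1)
				dst[i++] = *p;
			p++;
		}
	}
	dst[i] = 0;
	*src = p;
	return 1;
}
.EE
.PP
Called repeatedly on the same source pointer, the sketch yields one token
per call; a quoted run containing spaces comes back as a single token
without its quotes.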
.PP
.I S_tolower
converts all
.SM ASCII
characters in the string to lower case.
.SS Multithreading
.PP
.I S_incref
is used by multithreaded programs to avoid having the string memory
released until the last user of the string performs an
.IR s_free .
.I S_unique
returns a unique copy of the string: if the reference count is
1 it returns the string, otherwise it returns an
.I s_clone
of the string.
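.PP
The reference-counting idea can be sketched with a small stand-alone model.
The type
.B Ref
and the functions prefixed
.B r_
are hypothetical.
The model omits the lock that a multithreaded program would need, and it
assumes that taking a unique copy leaves the original's count unchanged,
which the text above does not specify.
.sp
.EX
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted string; locking is omitted. */
typedef struct Ref {
	int ref;	/* number of current users */
	char *text;
} Ref;

/* Allocate a new string holding a copy of p, with one user. */
Ref*
r_copy(const char *p)
{
	Ref *r = malloc(sizeof *r);

	if(r == NULL)
		return NULL;
	r->text = malloc(strlen(p) + 1);
	if(r->text == NULL){
		free(r);
		return NULL;
	}
	strcpy(r->text, p);
	r->ref = 1;
	return r;
}

/* Register one more user. */
Ref*
r_incref(Ref *r)
{
	r->ref++;
	return r;
}

/* Drop one user; release memory only when the last one lets go. */
void
r_free(Ref *r)
{
	if(--r->ref > 0)
		return;
	free(r->text);
	free(r);
}

/* Return r itself if it has one user, otherwise a private copy. */
Ref*
r_unique(Ref *r)
{
	if(r->ref == 1)
		return r;
	return r_copy(r->text);
}
.EE
.PP
With two users registered, the first free in this model only lowers the
count; the memory goes away on the second.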
.SS "Bio interaction
.PP
.I S_read
reads the requested number of characters through a
.I Biobuf
into a string. The string is grown as necessary.
End of file or an error terminates the read.
The number of bytes read is returned.
The string is null terminated.
.PP
.I S_read_line
reads up to and including the next newline and returns
a pointer to the beginning of the bytes read.
End of file or an error terminates the read.
The string is null terminated.
.PP
.I S_getline
reads up to the next newline, appends the input to
.IR s ,
and returns
a pointer to the beginning of the bytes read. Leading
spaces and tabs and the trailing newline are all discarded.
.I S_getline
discards blank lines and lines beginning with
.LR # .
.I S_getline
ignores
newlines escaped by immediately-preceding backslashes.
.PP
.I S_allocinstack
allocates an input stack with the single file
.I file
open for reading.
.I S_freeinstack
frees an input stack.
.I S_rdinstack
reads a line from an input stack.
It follows the same rules as
.I s_getline
except that when it encounters a line of the form
.B #include
.IR newfile ,
.I s_rdinstack
pushes
.I newfile
onto the input stack, postponing further reading of the current
file until
.I newfile
has been read.
The input stack has a maximum depth of 32 nested include files.
.SH SOURCE
.B \*9/src/libString
.SH SEE ALSO
.IR bio (3)