curl_easy_strerror man page zero terminated #5598

Closed
coinhubs opened this issue Jun 23, 2020 · 18 comments

@coinhubs

I did this

https://curl.haxx.se/libcurl/c/curl_easy_strerror.html

This page says "zero terminated string"

I expected the following

"NUL terminated string"

There may be other parts of the manual that could be updated in a similar way.

curl/libcurl version

[curl -V output]

operating system

@bagder
Member

bagder commented Jun 24, 2020

I usually go with "zero" in text instead of "nul" only because "nul" is too close to "null" and they're pronounced the same way and it certainly isn't NULL terminated...
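
To make the distinction concrete, here is a minimal C sketch. It is not taken from curl; the helper names `ends_with_nul` and `count_strings` are made up for illustration. The first checks for the terminating character (value zero), the second walks an array that ends with a NULL pointer, which is a different thing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the byte at index len of s is the
   terminating character '\0' (a char with value zero), otherwise 0. */
int ends_with_nul(const char *s, size_t len)
{
    assert(s != NULL);      /* NULL is a pointer value, not a character */
    return s[len] == '\0';  /* compares a char against the zero byte */
}

/* Hypothetical helper: counts entries in an array of strings whose end
   is marked by a NULL pointer rather than by a zero byte. */
size_t count_strings(const char *const *list)
{
    size_t n = 0;
    assert(list != NULL);
    while (list[n] != NULL)
        n++;
    return n;
}
```

Calling `ends_with_nul(msg, strlen(msg))` on any C string yields 1, while `count_strings` only makes sense for pointer arrays such as `argv`, where the sentinel really is NULL.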

@coinhubs
Author

> I usually go with "zero" in text instead of "nul" only because "nul" is too close to "null" and they're pronounced the same way and it certainly isn't NULL terminated...

Fair enough. I think there are lots of different opinions, although I have not seen 'zero' before. BSD generally calls it 'NUL'. ASCII and https://en.wikipedia.org/wiki/ISO/IEC_646 call it NUL.

POSIX strcpy()
http://pubs.opengroup.org/onlinepubs/9699919799/functions/strcpy.html

glibc:
https://www.gnu.org/software/libc/manual/html_node/Converting-Strings.html

The C and C++ standards don't use "zero".

I had a look at the Linux man pages; they consistently describe it as "including the terminating null byte ('\0')".

https://man7.org/linux/man-pages/man3/strcpy.3.html

Cheers

@bagder
Member

bagder commented Jun 25, 2020

My strcpy man page on Debian even uses the phrase "null-terminated".

"man ascii" describes the character as NUL '\0' (null character)

@bagder
Member

bagder commented Jun 25, 2020

I put up a very unscientific poll on twitter about it:

https://twitter.com/bagder/status/1276074546530000896

bagder added a commit that referenced this issue Jun 25, 2020
Updated terminology in docs, comments and phrases to refer to C strings
as "null-terminated". Done to unify with how most other C oriented docs
refer to them and what users in general seem to prefer (based on a
single highly unscientific poll on twitter).

Reported-by: coinhubs on github
Fixes #5598
@coinhubs
Author

Hi! "null-terminated" is in common use. Personally I use "NUL terminated", as it's clear it's not a NULL ptr, although C++ now has the nullptr keyword.

I looked online and found quite a few in favour of NUL:

https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
GCC uses NUL "-Wno-format-contains-nul
If -Wformat is specified, do not warn about format strings that contain NUL bytes."

http://man7.org/linux/man-pages/man7/bpf-helpers.7.html

int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)

          Description
                 Copy a NUL terminated string

POSIX uses NUL
http://man7.org/linux/man-pages/man3/strncpy.3p.html
" If a NUL character is written to the destination, the stpncpy()
function shall return the address of the first such NUL character."

http://man7.org/linux/man-pages/man3/strchr.3.html

strchrnul()
It's even called NUL in the function name

C11 Annex K uses NUL
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm

@bagder
Member

bagder commented Jun 26, 2020

If we are going to change this phrase, we should go with the common, most preferred one, or not change it at all. Then it is really not a question of what our own personal preferences are...

@bagder
Member

bagder commented Jun 26, 2020

Also, the question is not what the character or byte is called. The question is how to say zero terminated the most common/standard way.

@jzakrzewski
Contributor

Right. The funny thing is I typically only hear the phrase "null-terminated", so no idea how people would write it. I don't recall ever hearing "zero-terminated" (in context of strings) even though it's perfectly understandable.
As a side note: in C++ this is simply called a "c-string" :)

@coinhubs
Author

C++ ISO/IEC 14882 uses "null-terminated"

C ISO/IEC 9899 uses "null character"

"7.1.1 Definitions of terms
1 A string is a contiguous sequence of characters terminated by and including the first null
character. The term multibyte string is sometimes used instead to emphasize special
processing given to multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial (lowest addressed)
character. The length of a string is the number of bytes preceding the null character and
the value of a string is the sequence of the values of the contained characters, in order."
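
As a rough illustration of the quoted definition (a sketch only; `string_length` and `string_storage` are hypothetical names, not anything from the standard or from curl), the length counts the bytes before the first null character, while the storage also includes that character:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the length of a string is the number of bytes
   preceding the first null character. */
size_t string_length(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}

/* Hypothetical helper: the bytes the string occupies, since a string is
   "terminated by and including" the null character. */
size_t string_storage(const char *s)
{
    return string_length(s) + 1;
}
```

For the literal `"curl"` this gives a length of 4 (the same as `strlen`) and a storage of 5 (the same as `sizeof("curl")`). A literal such as `"ab\0cd"` has length 2 as a string, because the string ends at the first null character.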

@jzakrzewski
Contributor

> C++ ISO/IEC 14882 uses "null-terminated"

I meant "cstring" is commonly used by C++ programmers (most probably because it's short ;) ).
In any case "null-terminated" seems to be the most common way of calling it, so #5608 is the way to go.

@bagder
Member

bagder commented Jun 26, 2020

> uses "null character"

Finish this sentence: "A C string is a sequence of bytes that is..."

  1. "null-terminated"
  2. "null character"

I'm not a native English speaker but I know we can't use (2)...

@jzakrzewski
Contributor

> I'm not a native English speaker but I know we can't use (2)...

And nobody's gonna write / wants to read "terminated by a null character".

@bagder
Member

bagder commented Jun 26, 2020

I argue that "the string is terminated by a null character" and "the string is null-terminated" are just two sides of the same coin. In both cases we refer to the trailing character using the name "null".

@coinhubs
Author

> uses "null character"
>
> Finish this sentence: "A C string is a sequence of bytes that is..."
>
>   1. "null-terminated"
>   2. "null character"
>
> I'm not a native English speaker but I know we can't use (2)...

Ok, yes I'm a native English speaker:

"A C string is a sequence of bytes which is null-terminated"

@coinhubs
Author

> I argue that "the string is terminated by a null character" and "the string is null-terminated" are just two sides of the same coin. In both cases we refer to the trailing character using the name "null".

yes, the NUL byte http://www.asciitable.com/ is described as "null".

...Same as LF byte is described as "line feed". For me it seems we either state the macro name, or we provide the description. It's a bit like writing "EINVAL" or "Invalid argument" in relation to errno codes. Or "SIGSEGV" vs "Segmentation Violation" etc.

@bagder
Member

bagder commented Jun 26, 2020

I don't follow you. What exactly are you proposing?

You submit an issue saying we don't use the common terminology. We dig up the most commonly used and the most commonly preferred terminology but now that's not what you think we should use?

@coinhubs
Author

As you say, the most commonly used is "null-terminated".

So to give your example, that would be:
"A C string is a sequence of bytes which is null-terminated"

@bagder bagder closed this as completed in 032e838 Jun 27, 2020