strToHex (string to its hex representation as a string)
I want to convert strings to their hex representations, also as strings (like hex dump programs do). For example, "abz" becomes "61627A".

    char * strToHex( char * str )
    {
        int length = strlen ( str );
        char * newStr = malloc( length * 2 );
        if ( !newStr ) shutDown ( "can't alloc memory" ) ;
        for ( int x = 0; x < length; x++){
            char y = str[ x ];
            sprintf ( newStr + x * 2, "%02X", y );
        }
        return newStr;
    }

The definition of shutDown is omitted here; it is a function that calls perror and exit().

I designed strToHex to be used like this:

    char * str = "abcdefghijklmnopqrstuvwxyz";
    char * hex = strToHex(str);
    printf("%s\n",hex);
    //outputs : 6162636465666768696A6B6C6D6E6F707172737475767778797A

Tags: beginner, c, strings
I'd be really interested to see what shutdown(char* msg) does. – pacmaninbw, Apr 7 at 0:21

In the use case that was provided, since you can effectively predict the size, I would think it would be more natural to have a string buffer and the size passed in instead of creating it dynamically. – Neil Edelman, Apr 7 at 1:23

Won't printf() require hex to have a trailing '\0' byte? – jochen, 2 days ago

@pacmaninbw The argument name is actually "msg" as you guessed 😂. void shutDown(char * msg) { perror(msg); exit(EXIT_FAILURE); } – Accountant م, 2 days ago

@jochen Yes, thank you, I forgot to terminate newStr, and I was unlucky that the couple of tests I ran didn't fail. – Accountant م, 2 days ago
edited Apr 7 at 2:01 by mdfst13; asked Apr 6 at 23:02 by Accountant م
4 Answers
Bug

As Carsten points out, you need to allocate $(\text{length} \cdot 2) + 1$ bytes, rather than $(\text{length} \cdot 2)$, to account for the null terminator that sprintf() adds.

Formatting

Most C formatting guides do not put spaces around the arguments to function calls, nor around the expression within an if-statement. For an example of a C style most C programmers would find acceptable, see OpenBSD's style(9) manual.

I choose to associate * with the variable name, rather than leaving it floating between the type and name. This disambiguates the following example:

    int *a, b;

Here, a is a pointer to an integer, but b is only an integer. Moving the asterisk next to the name makes this clearer.

    int length = strlen ( str );
    char * newStr = malloc (length * 2 );
    if ( !newStr) shutDown ( "can't allocate memory" ) ;

Becomes:

    int const len = strlen(str);
    char *const new_str = malloc(1 + len * 2);
    if (new_str == NULL) {
        shutDown("can't allocate memory");
    }

Error checking

Rather than calling shutDown() and exit()ing the program, you should return an error value which the caller of str_to_hex() can check. Because you return a pointer, you can return NULL to indicate that an error occurred, and the caller can then check errno.

Likewise, on some systems your original program can incorrectly exit when length == 0. If we look at the manual page for malloc(3):

Return Value
The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable. On error, these functions return NULL. NULL may also be returned by a successful call to malloc() with a size of zero, or by a successful call to calloc() with nmemb or size equal to zero.

So by returning NULL we also account for the case where malloc(3) returns NULL on success. Once the allocation is 1 + len * 2 bytes, the requested size is never zero, so that case only arises with the original malloc(length * 2).

    if (new_str == NULL) {
        shutDown("can't alloc memory");
    }

Becomes:

    if (new_str == NULL) {
        return NULL;
    }

If you choose, you can also check whether str is NULL before calling strlen(). This is up to you, and it's not uncommon in C to ignore this case and leave it as user error.

Looping

Use the size_t type in your loop rather than int. size_t is guaranteed to be wide enough to hold any array index, while int is not.

Using i rather than x is more common for loop variables.

The y variable isn't needed. You can simply use str[i] in its place, cast to unsigned char so that bytes above 127 are not sign-extended.

In terms of performance there's likely a faster option than calling sprintf() for every byte, such as indexing into a table of hex digits.

Conclusion

Here is the code I ended up with:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char *
    str_to_hex(char const *const str)
    {
        size_t const len = strlen(str);
        char *const new_str = malloc(1 + len * 2);
        if (new_str == NULL) {
            return NULL;
        }
        /* Terminate up front so an empty input still yields a valid string. */
        new_str[0] = '\0';
        for (size_t i = 0; i < len; ++i) {
            /* Cast so bytes above 127 are not sign-extended. */
            sprintf(new_str + i * 2, "%02X", (unsigned char)str[i]);
        }
        return new_str;
    }

    int
    main(void)
    {
        char *str = "abz";
        char *hex = str_to_hex(str);
        if (hex == NULL) {
            /* error ... */
            return EXIT_FAILURE;
        }
        printf("%s\n", hex);
        free(hex);
        return EXIT_SUCCESS;
    }

Hope this helps!
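As one possible reading of the performance remark in the answer above, a lookup table can replace the per-byte sprintf() call. This is only a minimal sketch, not the answerer's code: the name str_to_hex_table is hypothetical, and it assumes that len * 2 + 1 does not overflow size_t.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: hex-encodes str into a newly allocated string.
   Returns NULL if the allocation fails; the caller frees the result. */
char *
str_to_hex_table(char const *const str)
{
    static char const digits[] = "0123456789ABCDEF";

    assert(str != NULL);
    size_t const len = strlen(str);
    char *const out = malloc(len * 2 + 1);   /* two digits per byte + '\0' */
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; ++i) {
        /* Go through unsigned char so bytes above 127 are not sign-extended. */
        unsigned char const byte = (unsigned char)str[i];
        out[i * 2] = digits[byte >> 4];        /* high nibble */
        out[i * 2 + 1] = digits[byte & 0x0F];  /* low nibble */
    }
    out[len * 2] = '\0';                     /* terminate explicitly */
    return out;
}
```

Because the terminator is written explicitly after the loop, an empty input still produces a valid empty string.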
Won't printf() require hex to have a trailing '\0' byte? – jochen, 2 days ago

You should allocate 2*len+1 bytes. – Carsten S, 2 days ago

sprintf adds a null terminator. @Carsten thanks, I forgot to include that, I'll update my answer. – esote, 2 days ago

Thank you very much, I'll consider every point seriously. Regarding the function definition, you wrote the return type on a separate line and then continued the function on the next line, like char *\nstr_to_hex(char const *const str)\n. Does this convention have a name or a reference that I can refer to? – Accountant م, 2 days ago

@Accountantم It is from the OpenBSD manual; the style is called BSD kernel normal form (KNF). Around the middle of the manual: "The function type should be on a line by itself preceding the function." – esote, 2 days ago
In my opinion, the most severe problem is "Insufficient target memory".

    int length = strlen ( str );
    char * newStr = malloc( length * 2 );

You are allocating twice the length of str, which is enough for all the hex characters (two hex chars per input byte).

But sprintf works differently. According to its documentation: "A terminating null character is automatically appended after the content".

So the last call to sprintf will write a terminating zero byte right after newStr, into unallocated memory. This might provoke all kinds of unintended behaviour, including (but not limited to) crashes.
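The off-by-one described above can be made concrete with a small sketch. The helper names hex_buffer_size and bytes_stored_for_one are hypothetical and used only for illustration; the second one checks that sprintf() stores three bytes for each input byte, namely two hex digits and the terminator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: buffer size needed to hex-encode len input bytes,
   counting the terminator that the last sprintf() call appends. */
static size_t
hex_buffer_size(size_t len)
{
    return len * 2 + 1;
}

/* Counts the bytes sprintf() stores for one input byte: the two hex
   digits plus the terminating null character. */
static size_t
bytes_stored_for_one(unsigned char byte)
{
    char buf[4];

    memset(buf, 'x', sizeof buf);   /* fill so a written '\0' is visible */
    int const digits = sprintf(buf, "%02X", (unsigned)byte);
    assert(digits == 2);
    assert(buf[digits] == '\0');    /* the terminator landed after the digits */
    return (size_t)digits + 1;
}
```

Every call except the last has its terminator overwritten by the next pair of digits, which is why only one extra byte is needed overall.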
Yes, thank you, it's a bug. I forgot to terminate newStr; thanks for highlighting this, as it's the biggest problem in my code. – Accountant م, 2 days ago
Just one addition: this is like asprintf vs snprintf. One can effectively predict the size, so I would think it natural to have a string buffer and its size passed in instead of creating the buffer dynamically.

    #include <stdlib.h> /* EXIT_SUCCESS */
    #include <string.h> /* strlen */
    #include <stdio.h>  /* printf */
    #include <assert.h> /* assert */

    /** Converts {str} to the underlying bit representation in hex, stored in
     {hex}. It may fail to compute the entire string due to {hex_size}, in which
     case the return will be less than the {str} length.
     str: A valid null-terminated string.
     hex: The output string.
     hex_size: The output string's size.
     return: The number of characters from the original that it processed. */
    static size_t strToHex(const char *str, char *hex, size_t hex_size)
    {
        static const char digits[0x10] = { '0', '1', '2', '3', '4', '5',
            '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' };
        const size_t str_len = strlen(str), hex_len = hex_size - 1;
        const size_t length = str_len < hex_len / 2 ? str_len : hex_len / 2;
        const char *s = str;
        char *h = hex;
        size_t x;
        assert(str && hex);
        if(!hex_size) return 0;
        for(x = 0; x < length; x++)
            *h++ = digits[(*s & 0xF0) >> 4], *h++ = digits[*s++ & 0x0F];
        *h = '\0';
        return s - str;
    }

    int main(void)
    {
        const char *str = "abcdefghijklmnopqrstuvwxyz", *str2 = "æôƌԹظⓐa";
        char hex[80];
        size_t ret;
        ret = strToHex(str, hex, sizeof hex);
        printf("\"%s\" -> \"%s\" (%lu.)\n", str, hex, (unsigned long)ret);
        ret = strToHex(str, hex, sizeof hex / 2);
        printf("\"%s\" -> \"%s\" (%lu.)\n", str, hex, (unsigned long)ret);
        ret = strToHex(str, hex, 0);
        printf("\"%s\" -> \"%s\" (%lu.)\n", str, hex, (unsigned long)ret);
        ret = strToHex(str2, hex, sizeof hex);
        printf("\"%s\" -> \"%s\" (%lu.)\n", str2, hex, (unsigned long)ret);
        return EXIT_SUCCESS;
    }

It cannot really fail if given the proper input, so this simplifies error checking a lot, especially in C. malloc and sprintf are pretty slow functions, comparatively, so I expect this to be faster and more robust.
Thank you very much, I like the way you documented the function. Is this a known convention for documenting C code? I also like the performance advantage this function has over the one I posted, but I generally give easier user interfaces priority over performance, unless I hit bottlenecks. I will study your code again today with a deeper look. – Accountant م, 2 days ago

The key thing here is the absence of malloc, which makes it much easier to use. Not that malloc is bad, e.g., asprintf, but I find it makes it more complex to use properly. I like to try to code so that it's easy to put into en.wikipedia.org/wiki/Doxygen, with some modifications for SO. – Neil Edelman, yesterday
I did more tests on the function today and found another bug (shame on me), and AFAIK on Code Review I can't change the original code in the question since it has received reviews.

If any byte has a value above 127, the function prints it with leading FF digits instead of its real value. To reproduce:

    char str[] = {127, 0};
    char * hex = strToHex(str);
    printf("%s\n",hex); //prints 7F (NORMAL)

    //now try with this
    char str[] = {128, 0};
    char * hex = strToHex(str);
    printf("%s\n",hex); //output starts with FF instead of being 80 (BUG)

It shows up when the function is used with non-English characters, because in UTF-8 they are stored as bytes with the most significant bit set to 1.

The Fix

To fix it, replace this line

    sprintf ( newStr + x * 2, "%02X", y );

with this

    sprintf ( newStr + x * 2, "%02hhX", y ); // added hh

This is because y is of type char (which is signed on this platform), and the X specifier expects an unsigned int argument if no length modifier is given. The hh length modifier tells the function to convert the argument to unsigned char before printing. Check the length table of printf.

Without hh, y is promoted to int when it is passed to sprintf, and X then reads that value as an unsigned int. The promotion goes like this.

When we defined str as char and assigned the value 128 to it, it was stored as

    1000 0000

The compiler treats this as -128 because its type is a signed char. To represent -128 in the size of an int, the sign bit is extended:

    1111 1111 1111 1111 1111 1111 1000 0000
    ^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^

The 02 in %02X is a minimum width, not a maximum, so sprintf writes all eight digits FFFFFF80. The next call overwrites everything after the first two digits, which leaves FF where the byte's real value should be.
This is only an issue on implementations that treat char as signed. If a char is unsigned it isn't a problem. Other possible fixes include declaring y as an unsigned char, casting y to an unsigned char in the sprintf call, or masking it (y & 0xFF). – 1201ProgramAlarm, 2 days ago
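Two of the alternative fixes named in the comment above, the cast and the mask, can be sketched as follows. The helper names byte_to_hex_cast and byte_to_hex_mask are hypothetical; each writes two hex digits and a terminator into a three-byte buffer supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: cast the byte to unsigned char before printing,
   so a negative char is never sign-extended. */
static void
byte_to_hex_cast(char c, char out[3])
{
    assert(out != NULL);
    sprintf(out, "%02X", (unsigned)(unsigned char)c);
}

/* Hypothetical helper: let the promotion to int happen, then mask off
   the sign-extended high bits. */
static void
byte_to_hex_mask(char c, char out[3])
{
    assert(out != NULL);
    sprintf(out, "%02X", (unsigned)(c & 0xFF));
}
```

Both variants produce the same two digits whether char is signed or unsigned on the target platform, so they do not depend on the hh length modifier.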
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f216992%2fstrtohex-string-to-its-hex-representation-as-string%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Bug
As Carsten points out, you need to allocate $(text{length}cdot 2)+1$ bytes, rather than $(text{length}cdot2)$ to account for the null terminator sprintf()
adds.
Formatting
Most C formatting guides do not include spaces around the arguments to function calls, nor the expressions within an if-statement. For an example of a C style most C programmers would find acceptable, see OpenBSD's style(9)
manual.
I choose to associate *
with the variable name, rather than floating between the type and name. This disambiguates the following example:
int *a, b;
Here, a
is a pointer to an integer, but b
is only an integer. By moving the asterisk next to the name, it makes this clearer.
int length = strlen ( str );
char * newStr = malloc (length * 2 );
if ( !newStr) shutDown ( "can't allocate memory" ) ;
Becomes:
int const len = strlen(str);
char *const new_str = malloc(1 + len * 2);
if (new_str == NULL) {
shutDown("can't allocate memory");
}
Error checking
Rather than calling shutDown()
and exit()
ing the program, you should instead return an error value which can be checked by the caller of str_to_hex()
. Because you return a pointer, you can return NULL
to indicate an error occurred and the caller should check errno
.
Likewise, on some systems your program can incorrectly exit when length == 0
. If we look at the manual page for malloc(3)
:
Return Value
The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable. On error, these functions return NULL. NULL may also be returned by a successful call to malloc() with a size of zero, or by a successful call to calloc() with nmemb or size equal to zero.
So by returning NULL
we account for the case where malloc(3)
returns NULL on success.
if (new_str == NULL) {
shutDown("can't alloc memory");
}
Becomes:
if (new_str == NULL) {
return NULL;
}
If you choose, you can also check if str
is NULL before calling strlen()
. This is up to you, and it's not uncommon in C to ignore this case and leave it as user error.
Looping
Use the size_t
type in your loop rather than int
. size_t
is guaranteed be wide enough to hold any array index, while int
is not.
Using i
rather than x
is more common for looping variables.
The y
variable isn't needed. You can simply use str[i]
in its place.
In terms of performance there's likely a faster option than using sprintf()
. You should look into strtol(3)
.
Conclusion
Here is the code I ended up with:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *
str_to_hex(char const *const str)
{
size_t const len = strlen(str);
char *const new_str = malloc(1 + len * 2);
if (new_str == NULL) {
return NULL;
}
for (size_t i = 0; i < len; ++i) {
sprintf(new_str + i * 2, "%02X", str[i]);
}
return new_str;
}
int
main(void)
{
char *str = "abz";
char *hex = str_to_hex(str);
if (hex == NULL && strlen(str) != 0) {
/* error ... */
}
printf("%sn",hex);
free(hex);
}
Hope this helps!
$endgroup$
2
$begingroup$
Won'tprintf()
requirehex
to have a trailingbyte?
$endgroup$
– jochen
2 days ago
4
$begingroup$
You should allocate 2*len+1 bytes.
$endgroup$
– Carsten S
2 days ago
$begingroup$
sprintf adds a null terminator. @Carsten thanks, I forgot to include that, I'll update my answer
$endgroup$
– esote
2 days ago
$begingroup$
Thank you very much I'll consider every point seriously. regarding the functions definition, you wrote the return type on a separate line then on the next line you continue the function likechar * nstr_to_hex(char const *const str)n
is this convention has a name or reference that I can refer to ?
$endgroup$
– Accountant م
2 days ago
1
$begingroup$
@Accountantم It is from the OpenBSD manual, the style is called BSD kernel normal form (KNF). Around the middle of manual: "The function type should be on a line by itself preceding the function."
$endgroup$
– esote
2 days ago
add a comment |
$begingroup$
Bug
As Carsten points out, you need to allocate $(text{length}cdot 2)+1$ bytes, rather than $(text{length}cdot2)$ to account for the null terminator sprintf()
adds.
Formatting
Most C formatting guides do not include spaces around the arguments to function calls, nor the expressions within an if-statement. For an example of a C style most C programmers would find acceptable, see OpenBSD's style(9)
manual.
I choose to associate *
with the variable name, rather than floating between the type and name. This disambiguates the following example:
int *a, b;
Here, a
is a pointer to an integer, but b
is only an integer. By moving the asterisk next to the name, it makes this clearer.
int length = strlen ( str );
char * newStr = malloc (length * 2 );
if ( !newStr) shutDown ( "can't allocate memory" ) ;
Becomes:
int const len = strlen(str);
char *const new_str = malloc(1 + len * 2);
if (new_str == NULL) {
shutDown("can't allocate memory");
}
Error checking
Rather than calling shutDown()
and exit()
ing the program, you should instead return an error value which can be checked by the caller of str_to_hex()
. Because you return a pointer, you can return NULL
to indicate an error occurred and the caller should check errno
.
Likewise, on some systems your program can incorrectly exit when length == 0
. If we look at the manual page for malloc(3)
:
Return Value
The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable. On error, these functions return NULL. NULL may also be returned by a successful call to malloc() with a size of zero, or by a successful call to calloc() with nmemb or size equal to zero.
So by returning NULL
we account for the case where malloc(3)
returns NULL on success.
if (new_str == NULL) {
shutDown("can't alloc memory");
}
Becomes:
if (new_str == NULL) {
return NULL;
}
If you choose, you can also check if str
is NULL before calling strlen()
. This is up to you, and it's not uncommon in C to ignore this case and leave it as user error.
Looping
Use the size_t
type in your loop rather than int
. size_t
is guaranteed be wide enough to hold any array index, while int
is not.
Using i
rather than x
is more common for looping variables.
The y
variable isn't needed. You can simply use str[i]
in its place.
In terms of performance there's likely a faster option than using sprintf()
. You should look into strtol(3)
.
Conclusion
Here is the code I ended up with:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *
str_to_hex(char const *const str)
{
size_t const len = strlen(str);
char *const new_str = malloc(1 + len * 2);
if (new_str == NULL) {
return NULL;
}
for (size_t i = 0; i < len; ++i) {
sprintf(new_str + i * 2, "%02X", str[i]);
}
return new_str;
}
int
main(void)
{
char *str = "abz";
char *hex = str_to_hex(str);
if (hex == NULL && strlen(str) != 0) {
/* error ... */
}
printf("%sn",hex);
free(hex);
}
Hope this helps!
$endgroup$
2
$begingroup$
Won'tprintf()
requirehex
to have a trailingbyte?
$endgroup$
– jochen
2 days ago
4
$begingroup$
You should allocate 2*len+1 bytes.
$endgroup$
– Carsten S
2 days ago
$begingroup$
sprintf adds a null terminator. @Carsten thanks, I forgot to include that, I'll update my answer
$endgroup$
– esote
2 days ago
$begingroup$
Thank you very much I'll consider every point seriously. regarding the functions definition, you wrote the return type on a separate line then on the next line you continue the function likechar * nstr_to_hex(char const *const str)n
is this convention has a name or reference that I can refer to ?
$endgroup$
– Accountant م
2 days ago
1
$begingroup$
@Accountantم It is from the OpenBSD manual, the style is called BSD kernel normal form (KNF). Around the middle of manual: "The function type should be on a line by itself preceding the function."
$endgroup$
– esote
2 days ago
add a comment |
$begingroup$
Bug
As Carsten points out, you need to allocate $(text{length}cdot 2)+1$ bytes, rather than $(text{length}cdot2)$ to account for the null terminator sprintf()
adds.
Formatting
Most C formatting guides do not include spaces around the arguments to function calls, nor the expressions within an if-statement. For an example of a C style most C programmers would find acceptable, see OpenBSD's style(9)
manual.
I choose to associate *
with the variable name, rather than floating between the type and name. This disambiguates the following example:
int *a, b;
Here, a
is a pointer to an integer, but b
is only an integer. By moving the asterisk next to the name, it makes this clearer.
int length = strlen ( str );
char * newStr = malloc (length * 2 );
if ( !newStr) shutDown ( "can't allocate memory" ) ;
Becomes:
int const len = strlen(str);
char *const new_str = malloc(1 + len * 2);
if (new_str == NULL) {
shutDown("can't allocate memory");
}
Error checking
Rather than calling shutDown()
and exit()
ing the program, you should instead return an error value which can be checked by the caller of str_to_hex()
. Because you return a pointer, you can return NULL
to indicate an error occurred and the caller should check errno
.
Likewise, on some systems your program can incorrectly exit when length == 0
. If we look at the manual page for malloc(3)
:
Return Value
The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable. On error, these functions return NULL. NULL may also be returned by a successful call to malloc() with a size of zero, or by a successful call to calloc() with nmemb or size equal to zero.
So by returning NULL
we account for the case where malloc(3)
returns NULL on success.
if (new_str == NULL) {
shutDown("can't alloc memory");
}
Becomes:
if (new_str == NULL) {
return NULL;
}
If you choose, you can also check if str
is NULL before calling strlen()
. This is up to you, and it's not uncommon in C to ignore this case and leave it as user error.
Looping
Use the size_t
type in your loop rather than int
. size_t
is guaranteed be wide enough to hold any array index, while int
is not.
Using i
rather than x
is more common for looping variables.
The y
variable isn't needed. You can simply use str[i]
in its place.
In terms of performance there's likely a faster option than using sprintf()
. You should look into strtol(3)
.
Conclusion
Here is the code I ended up with:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *
str_to_hex(char const *const str)
{
size_t const len = strlen(str);
char *const new_str = malloc(1 + len * 2);
if (new_str == NULL) {
return NULL;
}
for (size_t i = 0; i < len; ++i) {
sprintf(new_str + i * 2, "%02X", str[i]);
}
return new_str;
}
int
main(void)
{
char *str = "abz";
char *hex = str_to_hex(str);
if (hex == NULL && strlen(str) != 0) {
/* error ... */
}
printf("%sn",hex);
free(hex);
}
Hope this helps!
$endgroup$
Bug
As Carsten points out, you need to allocate $(text{length}cdot 2)+1$ bytes, rather than $(text{length}cdot2)$ to account for the null terminator sprintf()
adds.
Formatting
Most C formatting guides do not include spaces around the arguments to function calls, nor the expressions within an if-statement. For an example of a C style most C programmers would find acceptable, see OpenBSD's style(9)
manual.
I choose to associate *
with the variable name, rather than floating between the type and name. This disambiguates the following example:
int *a, b;
Here, a
is a pointer to an integer, but b
is only an integer. By moving the asterisk next to the name, it makes this clearer.
int length = strlen ( str );
char * newStr = malloc (length * 2 );
if ( !newStr) shutDown ( "can't allocate memory" ) ;
Becomes:
int const len = strlen(str);
char *const new_str = malloc(1 + len * 2);
if (new_str == NULL) {
shutDown("can't allocate memory");
}
Error checking
Rather than calling shutDown()
and exit()
ing the program, you should instead return an error value which can be checked by the caller of str_to_hex()
. Because you return a pointer, you can return NULL
to indicate an error occurred and the caller should check errno
.
Likewise, on some systems your program can incorrectly exit when length == 0
. If we look at the manual page for malloc(3)
:
Return Value
The malloc() and calloc() functions return a pointer to the allocated memory that is suitably aligned for any kind of variable. On error, these functions return NULL. NULL may also be returned by a successful call to malloc() with a size of zero, or by a successful call to calloc() with nmemb or size equal to zero.
So by returning NULL
we account for the case where malloc(3)
returns NULL on success.
if (new_str == NULL) {
shutDown("can't alloc memory");
}
Becomes:
if (new_str == NULL) {
return NULL;
}
If you choose, you can also check if str
is NULL before calling strlen()
. This is up to you, and it's not uncommon in C to ignore this case and leave it as user error.
Looping
Use the size_t type in your loop rather than int. size_t is guaranteed to be wide enough to hold any array index, while int is not.
Using i rather than x is more common for loop variables.
The y variable isn't needed. You can simply use str[i] in its place.
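In terms of performance, there is likely a faster option than calling sprintf() once per byte, such as indexing a small table of hex digits. (strtol(3) is not that option: it parses a number out of a string, which is the opposite direction.)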
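To make the performance remark concrete, here is a minimal sketch of one way to avoid sprintf(): look each nibble up in a table of digits. The name str_to_hex_table is made up for this example, and I have not benchmarked it, so treat the speed claim as an expectation rather than a measurement.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch only: same contract as str_to_hex() in this answer (the caller
 * frees the result, NULL means the allocation failed), but each byte is
 * converted by indexing a digit table instead of calling sprintf().
 * The name str_to_hex_table is hypothetical.
 */
char *
str_to_hex_table(char const *const str)
{
    static char const digits[] = "0123456789ABCDEF";

    assert(str != NULL);

    size_t const len = strlen(str);
    char *const out = malloc(1 + len * 2);
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; ++i) {
        /* go through unsigned char so bytes above 0x7F index correctly */
        unsigned char const byte = (unsigned char)str[i];
        out[i * 2] = digits[byte >> 4];
        out[i * 2 + 1] = digits[byte & 0x0F];
    }
    out[len * 2] = '\0';
    return out;
}
```

It is used exactly like the sprintf() version: call it, check for NULL, print the result, then free() it.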
Conclusion
Here is the code I ended up with:
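
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *
str_to_hex(char const *const str)
{
    size_t const len = strlen(str);
    /* two hex digits per input byte, plus the terminating '\0' */
    char *const new_str = malloc(1 + len * 2);

    if (new_str == NULL) {
        return NULL;
    }
    new_str[0] = '\0'; /* the loop never runs for an empty input */
    for (size_t i = 0; i < len; ++i) {
        /* cast so that bytes above 0x7F are not sign-extended */
        sprintf(new_str + i * 2, "%02X", (unsigned char)str[i]);
    }
    return new_str;
}

int
main(void)
{
    char const *str = "abz";
    char *hex = str_to_hex(str);

    /* malloc() is never asked for zero bytes, so NULL always means failure */
    if (hex == NULL) {
        /* error ... */
        return EXIT_FAILURE;
    }
    printf("%s\n", hex);
    free(hex);
    return EXIT_SUCCESS;
}
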
Hope this helps!
edited 2 days ago
answered Apr 7 at 1:17
esote

2
Won't printf() require hex to have a trailing \0 byte?
– jochen
2 days ago

4
You should allocate 2*len+1 bytes.
– Carsten S
2 days ago

sprintf adds a null terminator. @Carsten thanks, I forgot to include that, I'll update my answer.
– esote
2 days ago

Thank you very much, I'll consider every point seriously. Regarding the function definition: you wrote the return type on a separate line and then continued the function on the next line, like char * \nstr_to_hex(char const *const str)\n. Does this convention have a name or a reference that I can refer to?
– Accountant م
2 days ago

1
@Accountantم It is from the OpenBSD manual; the style is called BSD kernel normal form (KNF). Around the middle of the manual: "The function type should be on a line by itself preceding the function."
– esote
2 days ago

In my opinion, the most severe problem is "Insufficient target memory".

int length = strlen ( str );
char * newStr = malloc( length * 2 );

You are allocating twice the length of str, which is enough for all the hex characters (two hex chars per input byte).
But sprintf works differently. In the words of its documentation: "A terminating null character is automatically appended after the content".
So the last call to sprintf will write a terminating zero byte right after newStr, into unallocated memory. This might provoke all kinds of unintended behaviour, including (but not limited to) crashes.
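The extra byte is easy to observe. The following is only a small sketch (the helper name bytes_stored_per_call is made up for this example): it fills a scratch buffer with a sentinel, formats a single byte, and checks where the terminator landed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns the number of bytes one sprintf("%02X") call stores for a single
 * byte value, terminator included. Illustrative helper, not from the post.
 */
static size_t
bytes_stored_per_call(void)
{
    char buf[4];
    int n;

    memset(buf, 'x', sizeof buf);    /* sentinel fill */
    n = sprintf(buf, "%02X", 0x61u); /* stores '6', '1', '\0' */

    assert(n == 2);                  /* the return value counts digits only */
    assert(buf[2] == '\0');          /* but a terminator was stored as well */
    assert(buf[3] == 'x');           /* nothing beyond it was touched */
    return (size_t)n + 1;
}
```

Two digits plus one terminator per call is why the buffer needs 2 * length + 1 bytes: every call except the last has its terminator overwritten by the next pair of digits, but the last one stays.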
answered 2 days ago
jvb

Yes thank you, it's a bug. I forgot to terminate newStr. Thanks for highlighting this, as it's the biggest problem in my code.
– Accountant م
2 days ago

Just one addition, along the lines of asprintf vs snprintf. The output size is predictable, so I think it is more natural to have the caller pass in a string buffer and its size than to create the buffer dynamically.

#include <stdlib.h> /* EXIT_SUCCESS */
#include <string.h> /* strlen */
#include <stdio.h>  /* printf */
#include <assert.h> /* assert */

/** Converts {str} to the underlying bit representation in hex, stored in
 {hex}. It may fail to compute the entire string due to {hex_size}, in which
 case the return will be less than the {str} length.
 str: A valid null-terminated string.
 hex: The output string.
 hex_size: The output string's size.
 return: The number of characters from the original that it processed. */
static size_t strToHex(const char *str, char *hex, size_t hex_size)
{
    /* all sixteen digits, one per nibble value */
    static const char digits[0x10] = { '0', '1', '2', '3', '4', '5', '6', '7',
        '8', '9', 'A', 'B', 'C', 'D', 'E', 'F' };
    const size_t str_len = strlen(str), hex_len = hex_size - 1;
    const size_t length = str_len < hex_len / 2 ? str_len : hex_len / 2;
    const char *s = str;
    char *h = hex;
    size_t x;

    assert(str && hex);
    if(!hex_size) return 0;
    /* high nibble first, then low nibble */
    for(x = 0; x < length; x++)
        *h++ = digits[(*s & 0xF0) >> 4], *h++ = digits[*s++ & 0x0F];
    *h = '\0';
    return (size_t)(s - str);
}

int main(void)
{
    const char *str = "abcdefghijklmnopqrstuvwxyz", *str2 = "æôƌԹظⓐa";
    char hex[80];
    size_t ret;

    ret = strToHex(str, hex, sizeof hex);
    printf("\"%s\" -> \"%s\" (%lu.)\n", str, hex, (unsigned long)ret);
    ret = strToHex(str, hex, sizeof hex / 2);
    printf("\"%s\" -> \"%s\" (%lu.)\n", str, hex, (unsigned long)ret);
    ret = strToHex(str, hex, 0);
    printf("\"%s\" -> \"%s\" (%lu.)\n", str, hex, (unsigned long)ret);
    ret = strToHex(str2, hex, sizeof hex);
    printf("\"%s\" -> \"%s\" (%lu.)\n", str2, hex, (unsigned long)ret);
    return EXIT_SUCCESS;
}

It cannot really fail if given the proper input, so this simplifies error checking a lot, especially in C. malloc and sprintf are pretty slow functions, comparatively, so I expect this to be faster and more robust.
edited 2 days ago
answered 2 days ago
Neil Edelman

Thank you very much, I like the way you documented the function. Is this a known convention for documenting C code? I also like the performance advantage this function has over the one I posted, but I generally give easier user interfaces priority over performance, unless I hit bottlenecks. I will study your code again today with a deeper look.
– Accountant م
2 days ago

1
The key thing here is the absence of malloc, which makes it much easier to use. Not that malloc is bad, e.g. asprintf, but I find it makes it more complex to use properly. I like to try to code so that it's easy to put into en.wikipedia.org/wiki/Doxygen, with some modifications for SO.
– Neil Edelman
yesterday

I did more tests on the function today and found another bug (shame on me), and AFAIK on Code Review I can't change the original code in the question since it already got reviews.
Bytes with values above 127 come out wrong: the function displays them all as FF. To reproduce:
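
char str[] = {127, 0};
char *hex = strToHex(str);
printf("%s\n", hex); // prints 7F (NORMAL)

// now try with this
char str2[] = {(char)128, 0};
char *hex2 = strToHex(str2);
// BUG where char is signed: with a 32-bit int this prints FFFFFF80,
// and in a longer string every such byte but the last shows up as FF
printf("%s\n", hex2);
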
The bug shows up when the function is used with non-English characters, because UTF-8 stores them as bytes with the most significant bit set to 1.
The Fix
To fix it, replace this line

sprintf ( newStr + x * 2, "%02X", y );

with this

sprintf ( newStr + x * 2, "%02hhX", y ); // added hh

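This is because y is of type char (signed char on this platform), and the X specifier expects its argument to be an unsigned int when no length modifier is given. Adding the hh length modifier tells the function that the argument is an unsigned char, so the value is converted back to unsigned char before it is printed. See the length-modifier table in the printf documentation.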
If we didn't provide hh, y would first be promoted from signed char to int, and sprintf would then read that int as an unsigned int. The promotion goes like this.
When we defined str as char and assigned the value 128 to it, it was stored as

1000 0000

Because the type is signed char, the compiler treats this as -128. To represent -128 in the size of an int, the sign bit is extended:

1111 1111 1111 1111 1111 1111 1000 0000
^^^^ ^^^^ ^^^^ ^^^^ ^^^^ ^^^^

The 02 in the format is only a minimum width, so sprintf writes all eight digits, FFFFFF80. The next loop iteration starts two characters further on and overwrites the rest, which is why each such byte ends up displayed as FF.
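The sign extension can be reproduced in isolation. The two helpers below are only a sketch (the names hex_plain and hex_hh are made up for this example); they take a signed char explicitly so the effect is visible even on platforms where plain char is unsigned. The exact number of leading F digits depends on the width of int, which I am assuming to be 32 bits when I say FFFFFF80.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Unfixed formatting: y is sign-extended to int before sprintf sees it. */
static void
hex_plain(char *out, size_t out_size, signed char y)
{
    assert(out != NULL && out_size > 0);
    /* the cast spells out the reinterpretation that %X performs */
    snprintf(out, out_size, "%02X", (unsigned int)y);
}

/* Fixed formatting: hh makes printf convert the value to unsigned char. */
static void
hex_hh(char *out, size_t out_size, signed char y)
{
    assert(out != NULL && out_size > 0);
    snprintf(out, out_size, "%02hhX", y);
}
```

Passing -128 to hex_plain yields a string longer than two characters that ends in 80 (FFFFFF80 with a 32-bit int), while hex_hh yields just 80. For 127 both give 7F.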
answered 2 days ago
Accountant م

1
This is only an issue on implementations that treat char as signed. If a char is unsigned it isn't a problem. Other possible fixes include declaring y as an unsigned char, casting y to an unsigned char in the sprintf call, or masking it (y & 0xFF).
– 1201ProgramAlarm
2 days ago

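For comparison, the three alternative fixes from the comment above could look roughly like this. It is a sketch with made-up helper names (hex_decl, hex_cast, hex_mask), each formatting one byte into a three-character buffer, and it assumes 8-bit bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fix 1: hold the byte in an unsigned char before formatting it. */
static void
hex_decl(char out[3], char c)
{
    unsigned char y = (unsigned char)c;

    assert(out != NULL);
    snprintf(out, 3, "%02X", (unsigned int)y);
}

/* Fix 2: cast to unsigned char at the call site. */
static void
hex_cast(char out[3], char c)
{
    assert(out != NULL);
    snprintf(out, 3, "%02X", (unsigned int)(unsigned char)c);
}

/* Fix 3: mask off the sign-extended upper bits after promotion. */
static void
hex_mask(char out[3], char c)
{
    assert(out != NULL);
    snprintf(out, 3, "%02X", (unsigned int)(c & 0xFF));
}
```

All three reduce the value to the range 0 to 255 before it is printed, so a byte such as 0x80 comes out as 80 whether plain char is signed or not.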
3
I'd be really interested to see what shutdown(char* msg) does.
– pacmaninbw
Apr 7 at 0:21

In the use case that was provided, since you can effectively predict the size, I would think it would be more natural to have a string buffer and the size passed in instead of creating it dynamically.
– Neil Edelman
Apr 7 at 1:23

2
Won't printf() require hex to have a trailing \0 byte?
– jochen
2 days ago

@pacmaninbw The argument name is actually "msg" as you guessed 😂.
void shutDown(char * msg) { perror(msg); exit(EXIT_FAILURE); }
– Accountant م
2 days ago

@jochen Yes, thank you, I forgot to terminate newStr, and I was unlucky that the couple of tests I ran didn't fail.
– Accountant م
2 days ago