Printing a hex dump for diagnostics

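I have written a few free-standing functions that produce a hex dump of the contents of a buffer (a std::string).

You can see it in action here.

The aim is for debugging diagnostics to print the buffer contents to a (logging) stream in a format similar to Wireshark's.

Any suggestions for improvement are welcome.

The code is as follows: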

#include <iostream>
#include <iomanip>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

std::ostream& render_printable_chars(std::ostream& os, const char* buffer, size_t bufsize) {
    os << " | ";
    for (size_t i = 0; i < bufsize; ++i)
    {
        if (std::isprint(buffer[i]))
        {
            os << buffer[i];
        }
        else
        {
            os << ".";
        }
    }
    return os;
}

std::ostream& hex_dump(std::ostream& os, const uint8_t* buffer, size_t bufsize, bool showPrintableChars = true)
{
    auto oldFormat = os.flags();
    auto oldFillChar = os.fill();

    os << std::hex;
    os.fill('0');
    bool printBlank = false;
    size_t i = 0;
    for (; i < bufsize; ++i)
    {
        if (i % 8 == 0)
        {
            if (i != 0 && showPrintableChars)
            {
                render_printable_chars(os, reinterpret_cast<const char*>(&buffer[i] - 8), 8);
            }
            os << std::endl;
            printBlank = false;
        }
        if (printBlank)
        {
            os << ' ';
        }
        os << std::setw(2) << std::right << unsigned(buffer[i]);
        if (!printBlank)
        {
            printBlank = true;
        }
    }
    if (i % 8 != 0 && showPrintableChars)
    {
        for (size_t j = 0; j < 8 - (i % 8); ++j)
        {
            os << "   ";
        }
        render_printable_chars(os, reinterpret_cast<const char*>(&buffer[i] - (i % 8)), (i % 8));
    }

    os << std::endl;

    os.fill(oldFillChar);
    os.flags(oldFormat);

    return os;
}

std::ostream& hex_dump(std::ostream& os, const std::string& buffer, bool showPrintableChars = true)
{
    return hex_dump(os,reinterpret_cast<const uint8_t*>(buffer.data()), buffer.length(),showPrintableChars);
}

Here is the test case:

int main()
{
    const char test[] = "abcdef123456\0zyxwvu987654";
    std::string s(test,sizeof(test));

    hex_dump(std::cout, s);

    return 0;
}

The output looks as expected:

61 62 63 64 65 66 31 32 | abcdef12
33 34 35 36 00 7a 79 78 | 3456.zyx
77 76 75 39 38 37 36 35 | wvu98765
34 00                   | 4.

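Answer #1

Did you test it with a buffer whose length is a multiple of 8?
In that case the last block does not get its printable display.
Are you sure about the newline at the start of the output?
Should the test output really include the implicit 0 terminator?
return 0; is implicit for main().

Modified test function: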

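int main()
{
    const char test[] = "abcdef123456\0zyxwvu987654";
    std::string s(test, sizeof(test) - 1);

    std::cout << "---\n";
    hex_dump(std::cout, s);
    std::cout << "---\n";
    hex_dump(std::cout, "");
    std::cout << "---\n";
    hex_dump(std::cout, "12345678");
    std::cout << "---\n";
    hex_dump(std::cout, "1234567812345678");
    std::cout << "---\n";
}

Do a function's braces go on their own line or not? The code seems to be 2 + 1 for and 1 against; pick one style.
If you only want to output a single character, a character literal is likely more efficient than a string literal.
The conditional operator is well suited to choosing between two values.
Beware the implementation-defined signedness of plain char. The character classification functions inherited from C expect the value of an unsigned char, or EOF.
See std::isprint.
Consider marking internal functions static to avoid externally visible symbols and to encourage inlining.
Why provide a return value if it is never actually used?

static void render_printable_chars(std::ostream& os, const char* buffer, size_t bufsize)
{
    os << " | ";
    for (size_t i = 0; i < bufsize; ++i)
        os << (std::isprint((unsigned char)buffer[i]) ? buffer[i] : '.');
}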

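Don't play around with a flag for no good reason. Following data flow is far more involved than following program flow.
Testing whether a variable already has a value before setting it to that value achieves little besides obfuscation.
Don't use std::endl unless you really want to flush. And if flushing is your intent, consider being explicit with std::flush.
If you use C++17, you might want to accept a std::string_view by value instead of a std::string by constant reference. That is generally more efficient.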
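
The last few points can be illustrated with a small sketch. It assumes C++17, and the helper name `write_hex_line` is made up for this example; it is not part of the reviewed code. It takes a `std::string_view` by value, decides on the separator from the loop index instead of a flag, and ends the line with `'\n'` rather than `std::endl`:

```cpp
#include <cstddef>
#include <iomanip>
#include <ostream>
#include <string_view>

// Hypothetical helper: writes one line of space-separated hex bytes.
inline void write_hex_line(std::ostream& os, std::string_view bytes)
{
    const auto old_flags = os.flags();
    const auto old_fill = os.fill('0');  // fill('0') returns the previous fill
    os << std::hex << std::right;

    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0)
            os << ' ';  // separator derived from the index, no flag needed
        // Go through unsigned char so bytes >= 0x80 do not sign-extend.
        os << std::setw(2)
           << static_cast<unsigned>(static_cast<unsigned char>(bytes[i]));
    }
    os << '\n';  // ends the line without forcing a flush

    os.fill(old_fill);
    os.flags(old_flags);
}
```

Because `std::string_view` converts implicitly from both `std::string` and string literals, a caller can pass either without a cast.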

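Answer #2

I see some things that may help you improve your code.

Use void * for a generic parameter

As I'm sure you know, it is somewhat unusual to use a void * in modern C++, but this is one of the cases where it is useful, because it eliminates awkward casts on the caller's side. I would change the function's signature to this: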

std::ostream& hex_dump(std::ostream& os, const void *buffer, std::size_t bufsize, bool showPrintableChars = true)

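Make sure not to invoke undefined behavior

As already noted in the comments, if the value passed to std::isprint is not representable as unsigned char and does not have the value of EOF, the behavior is undefined. We can avoid that with a convenience cast inside the function, given that, per the previous point, we are already passing in a const void *: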

const unsigned char *buf{reinterpret_cast<const unsigned char *>(buffer)};

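Check for nullptr when dealing with raw pointers

To be safe, I recommend testing any pointer passed to a function against nullptr before dereferencing it. In this case, the first few lines of the function might look like this: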

if (buffer == nullptr) {
    return os;
}

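Eliminate "magic numbers"

If I wanted a line length of 16 instead of 8, I would have to hunt down every instance of 8 in the code and make sure I changed only the relevant ones. Instead, I would advocate a named constant like this: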

constexpr std::size_t maxline{8};

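Avoid making two passes through the data

There is no real need to pass over the data twice. As each character is read, it can be processed into both its hex and its printable form. To do that, I suggest a small local buffer for the printable version, since we know it holds only maxline characters plus a terminating NUL character: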

char renderString[maxline+1];

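Use existing variables where appropriate

The bufsize variable already contains the size of the array, so there is really no need to introduce another variable i to keep track of it. Because it is passed by value, we effectively have a local copy and can use it directly in the loop: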

for (std::size_t linecount=std::min(maxline, bufsize) ;bufsize; --bufsize, ++buf) {

Result

Applying all of these suggestions gives us a simpler, smaller and safer interface in a single function. Here is the alternative version:

std::ostream& hex_dump(std::ostream& os, const void *buffer, 
                       std::size_t bufsize, bool showPrintableChars = true)
{
    if (buffer == nullptr) {
        return os;
    }
    auto oldFormat = os.flags();
    auto oldFillChar = os.fill();
    constexpr std::size_t maxline{8};
    // create a place to store text version of string
    char renderString[maxline+1];
    char *rsptr{renderString};
    // convenience cast
    const unsigned char *buf{reinterpret_cast<const unsigned char *>(buffer)};

    for (std::size_t linecount=maxline; bufsize; --bufsize, ++buf) {
        os << std::setw(2) << std::setfill('0') << std::hex 
           << static_cast<unsigned>(*buf) << ' ';
        *rsptr++ = std::isprint(*buf) ? *buf : '.';
        if (--linecount == 0) {
            *rsptr++ = '\0';  // terminate string
            if (showPrintableChars) {
                os << " | " << renderString;
            } 
            os << '\n';
            rsptr = renderString;
            linecount = std::min(maxline, bufsize);
        }
    }
    // emit newline if we haven't already
    if (rsptr != renderString) {
        if (showPrintableChars) {
            for (*rsptr++ = '\0'; rsptr != &renderString[maxline+1]; ++rsptr) {
                 os << "   ";
            }
            os << " | " << renderString;
        }
        os << '\n';
    }

    os.fill(oldFillChar);
    os.flags(oldFormat);
    return os;
}

Example usage:

int main()
{
    const char test[] = "abcdef123456\0zyxwvu987654Edward";
    const std::string s(test,sizeof(test));
    const std::wstring s2{L"A wide string."};
    const double not_really_pi{22.0/7};

    std::cout << "\nbasic string:\n";
    hex_dump(std::cout, s.data(), s.length()*sizeof(s.front()));
    std::cout << "\nwide string:\n";
    hex_dump(std::cout, s2.data(), s2.length()*sizeof(s2.front()));
    std::cout << "\na double\n";
    hex_dump(std::cout, &not_really_pi, sizeof(not_really_pi));
    std::cout << '\n';
}

Example output:

basic string:
61 62 63 64 65 66 31 32  | abcdef12
33 34 35 36 00 7a 79 78  | 3456.zyx
77 76 75 39 38 37 36 35  | wvu98765
34 45 64 77 61 72 64 00  | 4Edward.

wide string:
41 00 00 00 20 00 00 00  | A... ...
77 00 00 00 69 00 00 00  | w...i...
64 00 00 00 65 00 00 00  | d...e...
20 00 00 00 73 00 00 00  |  ...s...
74 00 00 00 72 00 00 00  | t...r...
69 00 00 00 6e 00 00 00  | i...n...
67 00 00 00 2e 00 00 00  | g.......

a double
49 92 24 49 92 24 09 40  | I.$I.$.@

Further enhancements

It would be nice to be able to use the function like this:

int main()
{
    const char test[] = "abcdef123456\0zyxwvu987654Edward";
    const std::string s(test,sizeof(test));
    const std::wstring s2{L"A wide stringy."};
    const double not_really_pi{22.0/7};

    std::cout << "\nbasic string:\n" << hexDump(s.data(), s.length()*sizeof(s.front())) 
              << "\nwide string:\n" << hexDump(s2.data(), s2.length()*sizeof(s2.front()))
              << "\nraw char array:\n" << hexDump(test, sizeof(test))
              << "\na double\n" << hexDump(&not_really_pi, sizeof(not_really_pi)) << '\n';

}

This can be achieved with a few additional lines of code:

struct hexDump {
    const void *buffer;
    std::size_t bufsize;
    hexDump(const void *buf, std::size_t bufsz) : buffer{buf}, bufsize{bufsz} {}
    friend std::ostream &operator<<(std::ostream &out, const hexDump &hd) {
        return hex_dump(out, hd.buffer, hd.bufsize, true);
    }
};
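
A side note on the flag handling: both the question's function and the alternative version save and restore the stream's flags and fill character by hand. One possible way to make that harder to forget is a small scope guard. This is only a sketch; `StreamStateGuard` and `print_hex_byte` are hypothetical names invented for the illustration, not something from the code above.

```cpp
#include <iomanip>
#include <ios>
#include <ostream>

// Hypothetical helper: remembers flags and fill character, restores them on scope exit.
class StreamStateGuard {
public:
    explicit StreamStateGuard(std::ostream& os)
        : os_(os), flags_(os.flags()), fill_(os.fill()) {}
    ~StreamStateGuard()
    {
        os_.fill(fill_);
        os_.flags(flags_);
    }
    StreamStateGuard(const StreamStateGuard&) = delete;
    StreamStateGuard& operator=(const StreamStateGuard&) = delete;

private:
    std::ostream& os_;
    std::ios_base::fmtflags flags_;
    char fill_;
};

// Hypothetical usage: prints one byte as two hex digits.
inline void print_hex_byte(std::ostream& os, unsigned char byte)
{
    StreamStateGuard guard{os};
    os << std::hex << std::setfill('0') << std::setw(2)
       << static_cast<unsigned>(byte);
}   // the guard's destructor restores the caller's formatting here
```

With such a guard, an early return (for example the nullptr check) cannot skip the restore step.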

THX for the wide string example. I intend to expand on that later.
–πάνταῥεῖ
Jul 7, 2017 at 13:29

You are missing a test case with a buffer whose length is a multiple of 8...
– Deduplicator
Jul 7, 2017 at 18:49

I didn't show them, but in my own version of the code I used more test cases, including a zero-length buffer, a nullptr pointer, and different values for the line length.
– Edward
Jul 7, 2017 at 18:55

Ha! I see that despite my thorough testing, I failed to post the updated version of the code! Fixed now, thanks!
– Edward
Jul 7, 2017 at 19:03

Thanks! This is so basic, yet actually super useful. Funny how hard we have to beat C++ into submission. LOL. Even funnier that the tidy code has more text highlighted than any code snippet I have seen! Nice work!
– Oliver Schönrock
Jan 22, 2020 at 8:20

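Answer #3

Your code invokes undefined behavior when you feed it characters outside the basic execution character set, for example the bytes of UTF-8 encoded text. When your compiler defines char to have the same range as signed char, these characters are represented as negative numbers. Passing a negative number to the functions from <cctype> is allowed only for the special value EOF. I think fewer than 5% of C programmers know this, and it is easy to get wrong.

To fix this bug, call std::isprint(uint8_t(buffer[i])).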
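
To make the failure mode concrete, here is a minimal sketch. The wrapper name `is_printable` is made up for illustration. A byte such as 0xE2, which starts many UTF-8 sequences, becomes negative when char is signed, so it is converted to unsigned char before it reaches std::isprint:

```cpp
#include <cctype>

// Hypothetical wrapper: safe for any char value, including negative ones.
inline bool is_printable(char c)
{
    // Convert to unsigned char first, so the argument is always in 0..UCHAR_MAX.
    return std::isprint(static_cast<unsigned char>(c)) != 0;
}
```

Whether a byte above 0x7F counts as printable still depends on the current C locale; in the default "C" locale it does not.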

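Answer #4

A suggested improvement and a related possible enhancement:

The constant 8, the number of characters output per line, appears in many places. It deserves a meaningful name, so that its intent is clear and it can be changed with a single edit.

Once that is done, an enhancement should be obvious: allow the caller to specify chars_per_line in some way.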
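
One way this could look, purely as a sketch: chars_per_line becomes a defaulted parameter. The function name `hex_dump_n` and its signature are assumptions made for this illustration, and the layout only approximates the question's output format.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <iomanip>
#include <ostream>

// Hypothetical variant: the line width is a parameter instead of a literal 8.
inline std::ostream& hex_dump_n(std::ostream& os, const unsigned char* buffer,
                                std::size_t bufsize, std::size_t chars_per_line = 8)
{
    if (buffer == nullptr || chars_per_line == 0)
        return os;  // nothing sensible to print

    const auto old_flags = os.flags();
    const auto old_fill = os.fill('0');
    os << std::hex << std::right;

    for (std::size_t start = 0; start < bufsize; start += chars_per_line) {
        const std::size_t count = std::min(chars_per_line, bufsize - start);

        // Hex column: one "xx " cell per byte, blank cells on a short last line.
        for (std::size_t i = 0; i < chars_per_line; ++i) {
            if (i < count)
                os << std::setw(2) << static_cast<unsigned>(buffer[start + i]) << ' ';
            else
                os << "   ";
        }

        // Text column: non-printable bytes are shown as '.'.
        os << "| ";
        for (std::size_t i = 0; i < count; ++i) {
            const unsigned char b = buffer[start + i];
            os << static_cast<char>(std::isprint(b) ? b : '.');
        }
        os << '\n';
    }

    os.fill(old_fill);
    os.flags(old_flags);
    return os;
}
```

A call such as hex_dump_n(std::cout, data, size, 16) would then give 16 bytes per line, while existing callers keep the default of 8.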

Answer #5

One improvement I can think of is to print the buffer address before dumping the values:

if (i % 8 == 0)
{
    if (i != 0 && showPrintableChars)
    {
        render_printable_chars(os, reinterpret_cast<const char*>(&buffer[i] - 8), 8);
    }
    os << std::endl;
    printBlank = false;
    os << (void*)&buffer[i] << ": ";
}

See it here.

The output then looks like this:

0x209ec20: 61 62 63 64 65 66 31 32 | abcdef12
0x209ec28: 33 34 35 36 00 7a 79 78 | 3456.zyx
0x209ec30: 77 76 75 39 38 37 36 35 | wvu98765
0x209ec38: 34 00                   | 4.

Answer #6

Before I saw this post, I had already created my own implementation (live example):

// This template accepts a std::pair of iterators with value_type of size 1 byte.
template <typename T, typename Iter>
std::enable_if_t<
   sizeof( typename std::iterator_traits<Iter>::value_type )== 1, //std::is_same_v< typename std::iterator_traits<Iter>::value_type, char >,
   std::basic_ostream<T>
> &operator<<(std::basic_ostream<T> &os, std::pair<Iter, Iter> beginend)
{   
   auto prev_os_format = os.flags();
   auto prev_os_fill = os.fill();

   static_assert(sizeof(typename std::iterator_traits<Iter>::value_type) == 1); // Available for bytes only
   os << std::setw(2) << std::setfill('0') << std::hex << std::uppercase;
   using namespace std;
   string ascii;
   auto const &[begin, end] = beginend;
   auto iter = begin;
   const char *newline = ""; // append \n at the beginning of every line except the first one, not at the end.
   while (iter != end)
   {
      os << setw(0) << newline;
      unsigned short offset = iter - begin;
      os << "0x" << setfill('0') << setw(4) << std::right << unsigned(offset) << ": ";

      auto const line_end = iter + 16;
      for( auto const line_part : {iter+8,line_end} ){
         while (iter != line_part && iter != end)
         {
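            // Go through unsigned char: avoids UB in isprint and sign extension for negative char values.
            auto const byte = static_cast<unsigned char>(*iter);
            ascii += isprint(byte) ? static_cast<char>(byte) : '.';
            os << setw(2) << unsigned(byte) << " ";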
            ++iter;
         }
         os << " ";
      }
      // align ascii representation in last line
      for (int i = 0; i < line_end - iter; ++i)
         os << "   ";

      os << " |" << setfill(' ') << setw(16) << std::left << ascii << "|";
      ascii = "";
      newline = "\n";
   }

   os.flags(prev_os_format);
   os.fill(prev_os_fill);
   return os;
}

And the usage:

#include <algorithm>
#include <sstream>
#include <iostream>
#include <iterator>
#include <iomanip>
#include <string_view>
#include <string>
#include <vector>

int main()
{
   using namespace std;

   constexpr std::string_view sv0 = "hello\x02";
   cout << pair{begin(sv0), end(sv0)} << endl
        << endl;

   constexpr std::string_view sv1 = "hello world\x02khgavsd \x0B \x0A\x05Xasjhlasbdas jalsjdn\x13xa0";
   std::vector<char> v1 {begin(sv1),end(sv1)};  // will fail for std::vector<int>
   cout << pair{begin(v1), end(v1)} << endl
        << endl;

   constexpr std::string_view sv2 = "hello world\x02khgavsd \x0B \x0A\x05Xasjhlasbdas jalsjdn\x13  012345678asfd.hjbelfjdvn;kqewjnfd;lijvnbqe;jraf v;kqhjewrsljhfdbvi;jekbner;ifbsdvpi[ubep[ibuvqrub[iuqeb[iuabivwequniuweniupni]]]] ;afkdjvbnqe'orjnfavi;pjqerbipjvbqei[jrbwv[ipbqreo[iuwvfb[ioqeruwbvo[iubqrio[evbwsd[uibrqefp[iadubwerip[bvp[ieqwrubvipube9";
   cout << pair{begin(sv2), end(sv2)} << endl
        << endl;
}

And the output:

❯❯❯ c++ -std=c++17 ./test-str-hex.cpp && ./a.out
0x0000: 68 65 6C 6C 6F 02                                  |hello.          |

0x0000: 68 65 6C 6C 6F 20 77 6F  72 6C 64 02 6B 68 67 61   |hello world.khga|
0x0010: 76 73 64 20 0B 20 0A 05  58 61 73 6A 68 6C 61 73   |vsd . ..Xasjhlas|
0x0020: 62 64 61 73 20 6A 61 6C  73 6A 64 6E 13 78 61 30   |bdas jalsjdn.xa0|

0x0000: 68 65 6C 6C 6F 20 77 6F  72 6C 64 02 6B 68 67 61   |hello world.khga|
0x0010: 76 73 64 20 0B 20 0A 05  58 61 73 6A 68 6C 61 73   |vsd . ..Xasjhlas|
0x0020: 62 64 61 73 20 6A 61 6C  73 6A 64 6E 13 20 20 30   |bdas jalsjdn.  0|
0x0030: 31 32 33 34 35 36 37 38  61 73 66 64 2E 68 6A 62   |12345678asfd.hjb|
0x0040: 65 6C 66 6A 64 76 6E 3B  6B 71 65 77 6A 6E 66 64   |elfjdvn;kqewjnfd|
0x0050: 3B 6C 69 6A 76 6E 62 71  65 3B 6A 72 61 66 20 76   |;lijvnbqe;jraf v|
0x0060: 3B 6B 71 68 6A 65 77 72  73 6C 6A 68 66 64 62 76   |;kqhjewrsljhfdbv|
0x0070: 69 3B 6A 65 6B 62 6E 65  72 3B 69 66 62 73 64 76   |i;jekbner;ifbsdv|
0x0080: 70 69 5B 75 62 65 70 5B  69 62 75 76 71 72 75 62   |pi[ubep[ibuvqrub|
0x0090: 5B 69 75 71 65 62 5B 69  75 61 62 69 76 77 65 71   |[iuqeb[iuabivweq|
0x00A0: 75 6E 69 75 77 65 6E 69  75 70 6E 69 5D 5D 5D 5D   |uniuweniupni]]]]|
0x00B0: 20 3B 61 66 6B 64 6A 76  62 6E 71 65 27 6F 72 6A   | ;afkdjvbnqe'orj|
0x00C0: 6E 66 61 76 69 3B 70 6A  71 65 72 62 69 70 6A 76   |nfavi;pjqerbipjv|
0x00D0: 62 71 65 69 5B 6A 72 62  77 76 5B 69 70 62 71 72   |bqei[jrbwv[ipbqr|
0x00E0: 65 6F 5B 69 75 77 76 66  62 5B 69 6F 71 65 72 75   |eo[iuwvfb[ioqeru|
0x00F0: 77 62 76 6F 5B 69 75 62  71 72 69 6F 5B 65 76 62   |wbvo[iubqrio[evb|
0x0100: 77 73 64 5B 75 69 62 72  71 65 66 70 5B 69 61 64   |wsd[uibrqefp[iad|
0x0110: 75 62 77 65 72 69 70 5B  62 76 70 5B 69 65 71 77   |ubwerip[bvp[ieqw|
0x0120: 72 75 62 76 69 70 75 62  65 39                     |rubvipube9      |

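You have presented an alternative solution, but haven't reviewed the code. Please edit to show which aspects of the question's code prompted you to write this version, and in what ways it improves on the original. It may be worth (re-)reading "How to Answer".
– Toby Speight
Feb 11, 2020 at 12:31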
