Failing tests on MSVC with VS2019 15.9.13 x64 #1685

Closed
emmenlau opened this issue Jul 23, 2019 · 34 comments
Labels
kind: bug platform: visual studio related to MSVC state: help needed the issue needs help to proceed state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated

Comments

@emmenlau

Any help in debugging this issue is highly appreciated!

  • What is the issue you have?
    Two tests from the test suite exit with a segfault, and one test exits with a timeout:
      Start 67: test-regression_default
67/88 Test #67: test-regression_default .............***Exception: SegFault  0.95 sec
      Start 68: test-regression_all
68/88 Test #68: test-regression_all .................***Exception: SegFault  0.10 sec
      Start 80: test-unicode_all
80/88 Test #80: test-unicode_all ....................***Timeout 1500.04 sec
  • Please describe the steps to reproduce the issue. Can you provide a small but working code example?
    I execute ctest in the build directory.

  • Which compiler and operating system are you using?
    I used Microsoft Visual Studio 2017 v15.9.13 together with cmake 3.15.0-rc3 and ninja to build the solution. These tools work well for me with more than 60 software packages.

  • Did you use a released version of the library or the version from the develop branch?
    Release version 3.6.1.

@nlohmann nlohmann added the platform: visual studio related to MSVC label Jul 23, 2019
@nlohmann
Owner

That is strange: we successfully use MSVC 19.16.27030.1 (MSVC 2017) in our build CI: https://ci.appveyor.com/project/nlohmann/json/builds/26132437/job/kxhxdypa7gcpi3ls

@emmenlau
Author

I also find it very strange; I can hardly believe I have "discovered" such an obvious issue that nobody else has experienced before. But is there something I can try to investigate? I ran test-regression.exe from cmd and it reports a segfault in ucrtbased.dll that seems related to freeing memory. However, I'm unable to get a better stack trace in the Visual Studio debugger.

Can I send my test executable to someone for inspection?

@nlohmann
Owner

I'm not using Windows myself, but maybe someone else does.

@nlohmann nlohmann added the state: help needed the issue needs help to proceed label Jul 23, 2019
@risa2000

FWIW, I have run a quick (well, not so quick) check on my machine:

Visual Studio 2017 v15.9.14 (Community)
cmake version 3.12.18081601-MSVC_2
Microsoft (R) C/C++ Optimizing Compiler Version 19.16.27032.1 for x64
clang version 9.0.0 (https://github.com/llvm/llvm-project.git ec93536c20056f5503f896539ef258dd08b5d3db)
ninja 1.9.0 (this one I built from the source)
json current `develop` branch (Commit 65e4b973, 21.07.2019 14:10:37, Parent: 323cf95d)

I tried four builds: debug ("Debug" in CMake) and release ("Release" in CMake), each with MSVC (cl) and with Clang (clang-cl). The Clang variant does not even finish the cache generation in CMake and emits this error:

 Run Build Command:"D:/Bin/ninja.exe" "cmTC_32d19"
    [1/2] Building CXX object CMakeFiles\cmTC_32d19.dir\testCXXCompiler.cxx.obj
    [2/2] Linking CXX executable cmTC_32d19.exe
    FAILED: cmTC_32d19.exe 
    cmd.exe /C "cd . && "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E vs_link_exe --intdir=CMakeFiles\cmTC_32d19.dir --manifests  -- CMAKE_LINKER-NOTFOUND /nologo CMakeFiles\cmTC_32d19.dir\testCXXCompiler.cxx.obj  /out:cmTC_32d19.exe /implib:cmTC_32d19.lib /pdb:cmTC_32d19.pdb /version:0.0  /machine:x64  /debug /INCREMENTAL /subsystem:console  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
    RC Pass 1: command "rc /foCMakeFiles\cmTC_32d19.dir/manifest.res CMakeFiles\cmTC_32d19.dir/manifest.rc" failed (exit code 0) with the following output:
    The system cannot find the file specified
    ninja: build stopped: subcommand failed.

Then the Release build with MSVC passed without any problem. The Debug build times out in test-unicode_all (after 1500 sec), with all other tests before the timeout passing. (The Release build completes all the tests in 400 sec.)

I also did two tests with VS 2019:

Visual Studio 2019 v16.1.6 (Community)
cmake version 3.14.19050301-MSVC_2
Microsoft (R) C/C++ Optimizing Compiler Version 19.21.27702.2 for x64

and got SegFaults in the Debug build, while the Release build finished fine. The failing tests were:

11/88 Test #11: test-cbor_default ...................***Exception: SegFault 25.61 sec
12/88 Test #12: test-cbor_all .......................***Exception: SegFault 24.98 sec
57/88 Test #57: test-msgpack_default ................***Exception: SegFault 21.97 sec
58/88 Test #58: test-msgpack_all ....................***Exception: SegFault 22.02 sec
71/88 Test #71: test-testsuites_default .............***Exception: SegFault  0.59 sec
72/88 Test #72: test-testsuites_all .................***Exception: SegFault  0.58 sec
76/88 Test #76: test-ubjson_all .....................***Exception: SegFault162.85 sec

until, again, the timeout in test-unicode_all. The Release version took 375 sec.

The question is whether the failures come from the Debug build itself or whether they are merely masked in the Release build.

@nlohmann, it seems you are building it with x86 version of the compiler, while chances are that most of the people will use it with x64 target/host.

@risa2000

I ran the first failing test (test-cbor.exe) in the debugger. The cause of the crash was a stack overflow.

This is what was logged by the test logger:
json_unit-cbor_exception.txt

And this was on the call stack:
json_unit-cbor_exception-call_stack.txt

It is difficult for me to say whether the repeating exceptions at the end are just a consequence of the stack overflow or whether they are causing it.

@emmenlau
Author

Dear @risa2000, thanks a lot for your input, it's highly appreciated! So at least I'm not completely on my own with the issues I'm experiencing, even though your exact results are different.

@nlohmann, if you're not yet running tests against x64, could you enable that in your CI?

@nickaein
Contributor

The Debug build times out in test-unicode_all (after 1500 sec), with other tests before the timeout passing.

For test-unicode_all, the code emitted by MSVC in Debug mode is very slow. You might want to increase its timeout or skip this test in Debug builds, as is currently done in the AppVeyor config.

The other errors on x64 might be caused by a stack overflow. Can you increase the stack size as demonstrated in the AppVeyor config and see if that eliminates or reduces the failures?

@risa2000

The other errors on x64 might be caused by a stack overflow. Can you increase the stack size as demonstrated in the AppVeyor config and see if that eliminates or reduces the failures?

Yesterday, Visual Studio got a new version, which apparently also updated the toolchain (compiler) and (possibly) the STL, because I am getting different errors when trying to compile the tests with clang-cl:

Visual Studio 2019 Developer v16.2.0
cmake version 3.14.19060802-MSVC_2
Microsoft (R) C/C++ Optimizing Compiler Version 19.22.27905 for x64

I recompiled the x64-Debug build and linked it with an 8 MB stack (1 MB is the default) by using the /stack:8388608 linker flag.
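
For reference, the number passed to the linker is simply the stack reservation in bytes: 8 MiB is 8 * 1024 * 1024 = 8388608. The following is a minimal C++ sketch of that arithmetic. The helper names `mib_to_bytes` and `stack_flag` are made up for illustration; only the `/STACK:<bytes>` option syntax comes from the MSVC linker.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Convert a stack size given in MiB to bytes (1 MiB = 1024 * 1024 bytes).
constexpr std::uint64_t mib_to_bytes(std::uint64_t mib)
{
    return mib * 1024u * 1024u;
}

// Build the MSVC linker option for a given stack reservation.
// Hypothetical helper: it only formats the string, it does not invoke the linker.
std::string stack_flag(std::uint64_t mib)
{
    assert(mib > 0); // a zero-sized reservation makes no sense
    return "/STACK:" + std::to_string(mib_to_bytes(mib));
}
```

Under these assumptions, `stack_flag(8)` gives `/STACK:8388608` and `stack_flag(32)` gives `/STACK:33554432`.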

There are no longer any failing tests; only test-unicode_all still times out:

 1/88 Test  #1: test-algorithms_default .............   Passed    0.05 sec
 2/88 Test  #2: test-algorithms_all .................   Passed    0.02 sec
 3/88 Test  #3: test-allocator_default ..............   Passed    0.02 sec
 4/88 Test  #4: test-allocator_all ..................   Passed    0.01 sec
 5/88 Test  #5: test-alt-string_default .............   Passed    0.02 sec
 6/88 Test  #6: test-alt-string_all .................   Passed    0.01 sec
 7/88 Test  #7: test-bson_default ...................   Passed    0.05 sec
 8/88 Test  #8: test-bson_all .......................   Passed    0.14 sec
 9/88 Test  #9: test-capacity_default ...............   Passed    0.03 sec
10/88 Test #10: test-capacity_all ...................   Passed    0.01 sec
11/88 Test #11: test-cbor_default ...................   Passed   25.52 sec
12/88 Test #12: test-cbor_all .......................   Passed  620.88 sec
13/88 Test #13: test-class_const_iterator_default ...   Passed    0.03 sec
14/88 Test #14: test-class_const_iterator_all .......   Passed    0.01 sec
15/88 Test #15: test-class_iterator_default .........   Passed    0.02 sec
16/88 Test #16: test-class_iterator_all .............   Passed    0.01 sec
17/88 Test #17: test-class_lexer_default ............   Passed    0.02 sec
18/88 Test #18: test-class_lexer_all ................   Passed    0.01 sec
19/88 Test #19: test-class_parser_default ...........   Passed    0.46 sec
20/88 Test #20: test-class_parser_all ...............   Passed    0.45 sec
21/88 Test #21: test-comparison_default .............   Passed    0.04 sec
22/88 Test #22: test-comparison_all .................   Passed    0.02 sec
23/88 Test #23: test-concepts_default ...............   Passed    0.02 sec
24/88 Test #24: test-concepts_all ...................   Passed    0.01 sec
25/88 Test #25: test-constructor1_default ...........   Passed    0.05 sec
26/88 Test #26: test-constructor1_all ...............   Passed    0.04 sec
27/88 Test #27: test-constructor2_default ...........   Passed    0.02 sec
28/88 Test #28: test-constructor2_all ...............   Passed    0.01 sec
29/88 Test #29: test-convenience_default ............   Passed    0.02 sec
30/88 Test #30: test-convenience_all ................   Passed    0.01 sec
31/88 Test #31: test-conversions_default ............   Passed    0.06 sec
32/88 Test #32: test-conversions_all ................   Passed    0.05 sec
33/88 Test #33: test-deserialization_default ........   Passed    0.06 sec
34/88 Test #34: test-deserialization_all ............   Passed    0.05 sec
35/88 Test #35: test-element_access1_default ........   Passed    0.04 sec
36/88 Test #36: test-element_access1_all ............   Passed    0.03 sec
37/88 Test #37: test-element_access2_default ........   Passed    0.07 sec
38/88 Test #38: test-element_access2_all ............   Passed    0.05 sec
39/88 Test #39: test-inspection_default .............   Passed   84.51 sec
40/88 Test #40: test-inspection_all .................   Passed   84.70 sec
41/88 Test #41: test-items_default ..................   Passed    0.04 sec
42/88 Test #42: test-items_all ......................   Passed    0.01 sec
43/88 Test #43: test-iterators1_default .............   Passed    0.04 sec
44/88 Test #44: test-iterators1_all .................   Passed    0.02 sec
45/88 Test #45: test-iterators2_default .............   Passed    0.10 sec
46/88 Test #46: test-iterators2_all .................   Passed    0.09 sec
47/88 Test #47: test-json_patch_default .............   Passed    0.17 sec
48/88 Test #48: test-json_patch_all .................   Passed    0.14 sec
49/88 Test #49: test-json_pointer_default ...........   Passed    0.04 sec
50/88 Test #50: test-json_pointer_all ...............   Passed    0.03 sec
51/88 Test #51: test-merge_patch_default ............   Passed    0.03 sec
52/88 Test #52: test-merge_patch_all ................   Passed    0.02 sec
53/88 Test #53: test-meta_default ...................   Passed    0.02 sec
54/88 Test #54: test-meta_all .......................   Passed    0.01 sec
55/88 Test #55: test-modifiers_default ..............   Passed    0.04 sec
56/88 Test #56: test-modifiers_all ..................   Passed    0.02 sec
57/88 Test #57: test-msgpack_default ................   Passed   22.43 sec
58/88 Test #58: test-msgpack_all ....................   Passed  615.58 sec
59/88 Test #59: test-noexcept_default ...............   Passed    0.04 sec
60/88 Test #60: test-noexcept_all ...................   Passed    0.01 sec
61/88 Test #61: test-pointer_access_default .........   Passed    0.03 sec
62/88 Test #62: test-pointer_access_all .............   Passed    0.01 sec
63/88 Test #63: test-readme_default .................   Passed    0.02 sec
64/88 Test #64: test-readme_all .....................   Passed    0.01 sec
65/88 Test #65: test-reference_access_default .......   Passed    0.03 sec
66/88 Test #66: test-reference_access_all ...........   Passed    0.01 sec
67/88 Test #67: test-regression_default .............   Passed  112.37 sec
68/88 Test #68: test-regression_all .................   Passed  112.51 sec
69/88 Test #69: test-serialization_default ..........   Passed    0.04 sec
70/88 Test #70: test-serialization_all ..............   Passed    0.01 sec
71/88 Test #71: test-testsuites_default .............   Passed    5.18 sec
72/88 Test #72: test-testsuites_all .................   Passed    4.57 sec
73/88 Test #73: test-to_chars_default ...............   Passed    0.03 sec
74/88 Test #74: test-to_chars_all ...................   Passed    0.01 sec
75/88 Test #75: test-ubjson_default .................   Passed    9.38 sec
76/88 Test #76: test-ubjson_all .....................   Passed  163.08 sec
77/88 Test #77: test-udt_default ....................   Passed    0.03 sec
78/88 Test #78: test-udt_all ........................   Passed    0.01 sec
79/88 Test #79: test-unicode_default ................   Passed    0.05 sec
80/88 Test #80: test-unicode_all ....................***Timeout 1500.01 sec
81/88 Test #81: test-wstring_default ................   Passed    0.02 sec
82/88 Test #82: test-wstring_all ....................   Passed    0.01 sec
83/88 Test #83: cmake_import_configure ..............   Passed    2.14 sec
84/88 Test #84: cmake_import_build ..................   Passed    1.22 sec
85/88 Test #85: cmake_import_minver_configure .......   Passed    1.62 sec
86/88 Test #86: cmake_import_minver_build ...........   Passed    1.16 sec
87/88 Test #87: cmake_add_subdirectory_configure ....   Passed    1.71 sec
88/88 Test #88: cmake_add_subdirectory_build ........   Passed    1.21 sec

99% tests passed, 1 tests failed out of 88

Label Time Summary:
all        = 3102.66 sec*proc (41 tests)
default    = 261.23 sec*proc (41 tests)

Total Test time (real) = 3373.31 sec

The following tests FAILED:
	 80 - test-unicode_all (Timeout)
Errors while running CTest

As for the timeout, I believe that running one test for more than 25 minutes (the currently set timeout) is not very realistic anyway (I doubt many people will run it that long), so my idea is to try to run it in parallel instead.

Unfortunately, the test is written as one monolithic test case, so some work is needed, and I may revisit it, once I have some spare time, because it bothers me to see something running so long on one thread when there are eleven more available doing nothing. The same basically goes for all the other tests which complete in two or three digit times.
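
To make the idea concrete, here is a rough C++ sketch of how a monolithic loop over a numeric range could be split into contiguous chunks and run on several threads. It is not taken from the json test suite: `split_range`, `run_parallel` and the `check` callback are hypothetical names, and it assumes the work can be expressed as independent checks over a range such as the Unicode code points U+0000 to U+10FFFF.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <utility>
#include <vector>

// Split the inclusive range [first, last] into at most `parts` contiguous,
// non-overlapping sub-ranges of nearly equal size.
std::vector<std::pair<std::uint32_t, std::uint32_t>>
split_range(std::uint32_t first, std::uint32_t last, std::uint32_t parts)
{
    assert(first <= last && parts > 0);
    const std::uint64_t total = static_cast<std::uint64_t>(last) - first + 1;
    const std::uint64_t count = std::min<std::uint64_t>(parts, total);
    std::vector<std::pair<std::uint32_t, std::uint32_t>> chunks;
    std::uint64_t begin = first;
    for (std::uint64_t i = 0; i < count; ++i)
    {
        // the first (total % count) chunks get one extra element
        const std::uint64_t size = total / count + (i < total % count ? 1 : 0);
        chunks.emplace_back(static_cast<std::uint32_t>(begin),
                            static_cast<std::uint32_t>(begin + size - 1));
        begin += size;
    }
    return chunks;
}

// Run check(begin, end) on every chunk concurrently and sum the results.
// `check` must return std::uint64_t; it stands in for the real per-chunk work.
template <typename Check>
std::uint64_t run_parallel(std::uint32_t first, std::uint32_t last,
                           std::uint32_t parts, Check check)
{
    std::vector<std::future<std::uint64_t>> jobs;
    for (const auto& chunk : split_range(first, last, parts))
    {
        jobs.push_back(std::async(std::launch::async, check, chunk.first, chunk.second));
    }
    std::uint64_t total = 0;
    for (auto& job : jobs)
    {
        total += job.get(); // waits for the chunk to finish
    }
    return total;
}
```

With twelve hardware threads, `run_parallel(0, 0x10FFFF, 12, check)` would hand each thread roughly one twelfth of the range. Whether the real test cases are independent enough to be split this way is an open question.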

One comment on x64 vs x86: when I ran the debug tests on my Visual Studio 2017 setup (as described in my first post), even the debug builds did not fail (well, apart from the test-unicode_all timeout).

So it does seem that there is some other difference between VS 2017 and VS 2019 toolchain and/or STL implementation, which lets the former pass and the latter fail (possibly on stack overflow) on x64-Debug build.

@nickaein
Contributor

Running it in parallel would be a neat idea. I'm not sure if it can be broken down into several smaller tests, but that would be helpful too, especially for environments where parallelism is limited (e.g. AppVeyor).

So it does seem that there is some other difference between VS 2017 and VS 2019 toolchain and/or STL implementation.

I haven't run the tests on VS2019 and since AppVeyor hasn't added support for VS2019 yet I haven't pursued adding it to CI. Nevertheless, it would be great to get insight and make any changes (if necessary) to prepare the project for the latest compilers.

@nlohmann
Owner

The full Unicode test is nothing that should be executed all the time. It's only something handy to run if anything is changed in the parser. I would advise against touching the code to parallelize it.

@risa2000

Well, then, no touching the unicode :). What about touching the other tests (for parallelization)?

@nlohmann
Owner

The binary formats tests (CBOR, MessagePack, UBJSON, and BSON) are also a bit slow. Not sure how to parallelize them, though.

@stale

stale bot commented Aug 27, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Aug 27, 2019
@emmenlau
Author

Actually this is still a problem for us :-(

@stale stale bot removed the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Aug 27, 2019
@t-b
Contributor

t-b commented Aug 27, 2019

@emmenlau Could you post a batch/powershell script and what test output you get? It's difficult from the thread to see what exactly was done.

@stale

stale bot commented Sep 26, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Sep 26, 2019
@stale stale bot closed this as completed Oct 3, 2019
@emmenlau
Author

Dear @t-b, sorry for the delay, but the issue still persists. I have just verified it again. Can you say what exactly I can do to help? I build the sources from a cygwin bash shell that starts a cmd prompt via the script cygwin-exe-clean-env.sh. This setup is a bit more complex but has proved to work through two years and hundreds of different builds, so it's quite trustworthy:

cd "/d/Debug/json-3.7.0-x64" && \
export PATH="/d/Debug/bin:/d/Tools/bin:/d/Tools/apache-ant/bin:${PATH}" && \
export CPPFLAGS="${CPPFLAGS} /DDEBUG /DWINVER=_WIN32_WINNT_WIN7 /D_WIN32_WINNT=_WIN32_WINNT_WIN7 /D_ITERATOR_DEBUG_LEVEL=0" && \
export CFLAGS="${CFLAGS} /MDd /Zi ${CPPFLAGS}" && \
export CXXFLAGS="${CXXFLAGS} /MDd /Zi ${CPPFLAGS}" && \
export LDFLAGS="${LDFLAGS} /MACHINE:X64 /DEBUG" && \
export PKG_CONFIG_PATH="/d/Debug/lib/pkgconfig:/d/Tools/lib/pkgconfig:${PKG_CONFIG_PATH}" && \
cygwin-exe-clean-env.sh cmake ../json-3.7.0 \
    -G"Ninja" \
    -DCMAKE_VERBOSE_MAKEFILE="ON" \
    -DCMAKE_PREFIX_PATH="D:/Debug;D:/Debug/lib/cmake;D:/Tools/lib/cmake" \
    -DCMAKE_INSTALL_PREFIX="D:/Debug" \
    -DCMAKE_INSTALL_LIBDIR="lib" \
    -DCMAKE_INSTALL_DATADIR="lib" \
    -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY="ON" \
    -DCMAKE_BUILD_TYPE="Debug" \
    -DCMAKE_POSITION_INDEPENDENT_CODE="ON" \
    -DBUILD_SHARED_LIBS="OFF" \
    -DJSON_BuildTest="ON" \
    -DJSON_Install="ON" \
    -DJSON_MultipleHeaders="ON" && \
cygwin-exe-clean-env.sh ninja -j8 && \
cygwin-exe-clean-env.sh ctest

And here is the output I see:

...
      Start 66: test-reference_access_all
66/88 Test #66: test-reference_access_all ...........   Passed    0.02 sec
      Start 67: test-regression_default
67/88 Test #67: test-regression_default .............***Exception: SegFault  0.04 sec
      Start 68: test-regression_all
68/88 Test #68: test-regression_all .................***Exception: SegFault  0.04 sec
      Start 69: test-serialization_default
69/88 Test #69: test-serialization_default ..........   Passed    0.02 sec
...
98% tests passed, 2 tests failed out of 88

Label Time Summary:
all        = 1614.28 sec*proc (41 tests)
default    =  47.72 sec*proc (41 tests)

Total Test time (real) = 1673.42 sec

The following tests FAILED:
	 67 - test-regression_default (SEGFAULT)
	 68 - test-regression_all (SEGFAULT)
Errors while running CTest

The exact same problem is reproduced in our setup with Visual Studio 2019.3 and 2017.9.

But I understand that this is not super helpful for you. What can I do to move this forward? Send you the executables? Run with debug output enabled?

@risa2000

@emmenlau Have you tried to increase the stack size? It apparently did help in my case (in Debug build with MSVC). Are your release builds also failing?

@emmenlau
Author

Hi @risa2000! No, I did not try. For us the Windows build is very "inaccessible" in the CI system cloud, so I cannot just tweak a Visual Studio parameter. Do you by chance know how I can set the stack size via cmake or directly in the json C++ code?

@emmenlau
Author

Does this seem a reasonable setting for this test?

diff -Naur json-3.7.0.org/test/CMakeLists.txt json-3.7.0/test/CMakeLists.txt
--- json-3.7.0.org/test/CMakeLists.txt  2019-07-28 21:23:36.000000000 +0200
+++ json-3.7.0/test/CMakeLists.txt      2019-10-14 10:36:37.978247800 +0200
@@ -3,6 +3,10 @@
 option(JSON_NoExceptions "Build test suite without exceptions" OFF)
 option(JSON_Coverage "Build test suite with coverage information" OFF)
 
+if(MSVC)
+    set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /STACK:8388608")
+endif()
+
 if(JSON_Sanitizer)
     message(STATUS "Building test suite with Clang sanitizer")
     if(NOT MSVC)

@emmenlau
Author

Here is a more detailed error from this test:

D:\Debug\json-3.7.0-x64\test>test-regression.exe
[doctest] doctest version is "2.3.1"
[doctest] run with "--help" for options
===============================================================================
D:\Debug\json-3.7.0\test\src\unit-regression.cpp(163):
TEST CASE:  regression tests
  issue #228 - double values are serialized with commas as decimal points

D:\Debug\json-3.7.0\test\src\unit-regression.cpp(163): FATAL ERROR: test case CRASHED: SIGSEGV - Segmentation violation signal

===============================================================================
D:\Debug\json-3.7.0\test\src\unit-regression.cpp(163):
TEST CASE:  regression tests

===============================================================================
[doctest] test cases:      1 |      0 passed |      1 failed |      1 skipped
[doctest] assertions:     76 |     76 passed |      0 failed |
[doctest] Status: FAILURE!

@emmenlau
Author

I reproducibly get a crash with Visual Studio 2019 and 2017 in the last line of

        // check if locale is properly reset
        std::stringstream ss;
        ss.imbue(std::locale(std::locale(), new CommaDecimalSeparator));
        ss << 4712.11;
        CHECK(ss.str() == "4.712,11");

Does this mean anything to you? The exception says (in both Visual Studio versions) something along the lines of

Unhandled exception at 0x000007FEF4CB9DB7 (msvcp140d.dll) in test-regression.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF. occurred
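
The definition of `CommaDecimalSeparator` is not shown in this thread. To check whether the crash can be reproduced outside the test suite, a standalone sketch along the following lines might help. It assumes the facet is a `std::numpunct<char>` subclass that uses ',' as the decimal point, '.' as the thousands separator and groups of three digits, which is what the expected string "4.712,11" suggests; `format_with_comma_locale` is a made-up helper name.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Assumed shape of the facet used by the test: ',' as decimal point,
// '.' as thousands separator, digits grouped in threes.
struct CommaDecimalSeparator : std::numpunct<char>
{
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\03"; }
};

// Format a value through a stream imbued with the custom facet.
std::string format_with_comma_locale(double value)
{
    std::stringstream ss;
    // std::locale takes ownership of the facet and deletes it once the
    // last locale referring to it is destroyed.
    ss.imbue(std::locale(std::locale(), new CommaDecimalSeparator));
    ss << value;
    assert(ss.good()); // formatting a double should not put the stream into an error state
    return ss.str();
}
```

Calling `format_with_comma_locale(4712.11)` from a small `main` built with the same compiler flags as the failing test would show whether the access violation depends on the json library at all.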

@emmenlau
Author

The same happens with a stack size of 32 MB. Is this too low, or should it be sufficient?

@risa2000

@emmenlau For which target (platform) exactly do you build the tests? It is a bit confusing mentioning SIGSEGV, cygwin and MSVC in one row.

Can you debug the build elsewhere (if not in the CI environment)? Does only the debug build fail, or also the release build?

The error "Unhandled exception..." just means that there was invalid memory access (at the address 0xffff....), but this does not help much diagnosing the problem.

8MB stack was enough for me. But if you are running it in a VM (CI) is it possible there are some other limits, imposed by the environment?

@t-b
Contributor

t-b commented Oct 14, 2019

With ed55414 (Merge pull request #1779 from t-b/avoid-using-glob-in-cmake, 2019-10-09) and

PS E:\projekte\json> git diff .
diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt
index 762e5582..1656814e 100644
--- a/test/CMakeLists.txt
+++ b/test/CMakeLists.txt
@@ -3,6 +3,10 @@ option(JSON_Valgrind "Execute test suite with Valgrind" OFF)
 option(JSON_NoExceptions "Build test suite without exceptions" OFF)
 option(JSON_Coverage "Build test suite with coverage information" OFF)

+if(MSVC)
+    set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /STACK:8388608")
+endif()
+
 if(JSON_Sanitizer)
     message(STATUS "Building test suite with Clang sanitizer")
     if(NOT MSVC)

if I run

PS E:\projekte\json\build-debug-x64> cmake -G "Visual Studio 16 2019" -A x64 ..; cmake --build . --config Debug; ctest -C Debug -V -j 12

only one test fails, and that is test-unicode due to a timeout.

PS E:\projekte\json> cl
Microsoft (R) C/C++ Optimizing Compiler Version 19.23.28106.4 for x86
Copyright (C) Microsoft Corporation. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

@risa2000

I built the tests the same way as @t-b did:

cmake -G "Visual Studio 16 2019" -A x64 ..
cmake --build . --config Debug

then ran the tests, and all passed except the Unicode test, which I killed. In particular:

D:\Work_OSS\json\build>ctest -R test-regression -C debug
Test project D:/Work_OSS/json/build
    Start 67: test-regression_default
1/2 Test #67: test-regression_default ..........   Passed   30.08 sec
    Start 68: test-regression_all
2/2 Test #68: test-regression_all ..............   Passed   30.17 sec

@t-b
Contributor

t-b commented Oct 15, 2019

@risa2000

@t-b I believe the problems were only related to debug builds, while your CI builds a release one.

@t-b
Contributor

t-b commented Oct 15, 2019

@risa2000 Good point. I've added https://ci.appveyor.com/project/nlohmann/json/build/job/bkp1lqcisew79u4k. Let's see how it finishes.

@emmenlau
Author

emmenlau commented Oct 15, 2019

Do you have Windows development machines available so that I could send you the test files with pdb and you can check out the issue yourself?

I'm completely at a loss as to what I may be doing wrong. The lines of the crash do not make much sense with respect to an illegal memory access, because it would be inside a stringstream object that was just constructed in the same context a few lines above. So I can only think of a more generic memory corruption, or possibly an stl corruption due to a mix of different ABIs?

For what it's worth, I build my sources with /D_ITERATOR_DEBUG_LEVEL=0, which is a prerequisite of https://github.com/xtensor-stack/xtensor. Do you know if this may interfere with your code?
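
One thing that might be worth ruling out is which STL debug settings each translation unit actually sees, since the build above combines /MDd with /D_ITERATOR_DEBUG_LEVEL=0. The sketch below only reports the macros visible at compile time; it does not establish that this combination is the cause of the crash. The function names are hypothetical, and on a non-MSVC standard library `_ITERATOR_DEBUG_LEVEL` is normally not defined at all.

```cpp
#include <cassert>
#include <string>

// Value of _ITERATOR_DEBUG_LEVEL seen by this translation unit,
// or -1 when the macro is not defined (e.g. with a non-MSVC standard library).
constexpr int iterator_debug_level()
{
#ifdef _ITERATOR_DEBUG_LEVEL
    return _ITERATOR_DEBUG_LEVEL;
#else
    return -1;
#endif
}

// Whether a debug runtime was selected (/MDd and /MTd define _DEBUG on MSVC).
constexpr bool uses_debug_runtime()
{
#ifdef _DEBUG
    return true;
#else
    return false;
#endif
}

// Human-readable summary that can be printed from a test executable.
std::string describe_build()
{
    const int level = iterator_debug_level();
    assert(level >= -1 && level <= 2); // MSVC documents the levels 0, 1 and 2
    return "_ITERATOR_DEBUG_LEVEL=" + std::to_string(level) +
           ", debug runtime=" + (uses_debug_runtime() ? "yes" : "no");
}
```

Printing `describe_build()` from both the failing test and a plain Debug build without the extra define would at least show whether the two differ in these settings.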

@risa2000

For what it's worth, I build my sources with /D_ITERATOR_DEBUG_LEVEL=0, which is a prerequisite of https://github.com/xtensor-stack/xtensor. Do you know if this may interfere with your code?

FWIW, I have been successfully building an app with the JSON lib and the xtensor lib without this define, in both Debug and Release builds, using both MSVC and clang-cl. So why is it a prerequisite?

Could you post your compiler commands? (generated by CMake when run with -DCMAKE_EXPORT_COMPILE_COMMANDS=ON)

@emmenlau
Author

@emmenlau For which target (platform) exactly do you build the tests? It is a bit confusing mentioning SIGSEGV, cygwin and MSVC in one row.

Haha, yes, I can believe that sounds strange! However, it's not as uncommon as it may seem, because our CI system executes jobs via gitlab-runner, which in turn needs a scriptable shell, and usually this is one of CMD, bash or PowerShell. The actual build is run inside msbuild, ninja or devenv started via CMD /C, so the Microsoft compiler should not see much of bash. In any case, this affects only the build, while the crashes in the tests can be reproduced in plain old CMD.

Can you debug the build elsewhere (if not in the CI environment)? Does only the debug build fail, or also the release build?

OK, this may be interesting. I just tested test-regression.exe in a release build, and in this setup the test works just fine (in the same VM).

I've just also noticed that tests are compiled with COMPILE_OPTIONS "$<$<CXX_COMPILER_ID:MSVC>:/EHsc;$<$<CONFIG:Release>:/Od>>". Since the issue I observe is raised in msvcp140d.dll, could the exception model /EHsc have something to do with it? At least this is one thing that may be more "unique" to json, and we're not seeing issues with any other code.

8MB stack was enough for me. But if you are running it in a VM (CI) is it possible there are some other limits, imposed by the environment?

No, I don't think so. We have a VirtualBox VM with 32GB RAM and a standard Windows 7 x64 install, so there should be sufficient resources even for larger applications.

@emmenlau
Author

For what it's worth, I build my sources with /D_ITERATOR_DEBUG_LEVEL=0, which is a prerequisite of https://github.com/xtensor-stack/xtensor. Do you know if this may interfere with your code?

FWIW, I have been successfully building an app with the JSON lib and the xtensor lib without this define, in both Debug and Release builds, using both MSVC and clang-cl. So why is it a prerequisite?

As long as you're not using anything that requires xtensor iterator magic, you are fine. See however xtensor-stack/xtensor#1659 (comment) for a discussion of the problem.

Could you post your compiler commands? (generated by CMake when run with -DCMAKE_EXPORT_COMPILE_COMMANDS=ON)

Ok let me try that later today :)

@t-b
Contributor

t-b commented Oct 15, 2019

The CI build from #1685 (comment) passed.

We are seeing a warning when compiling test-regression.exe, so maybe that is something worth exploring.

see C:\projects\json\single_include\nlohmann/json.hpp(3258,1): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [C:\projects\json\test\test-regression.vcxproj]
C:\projects\json\single_include\nlohmann/json.hpp(3248): message : while compiling class template member function 'const std::basic_string<char,std::char_traits<char>,std::allocator<char>> &nlohmann::detail::iteration_proxy_value<nlohmann::detail::iter_impl<nlohmann::basic_json<std::map,std::vector,std::string,bool,int64_t,uint64_t,double,std::allocator,nlohmann::adl_serializer>>>::key(void) const' [C:\projects\json\test\test-regression.vcxproj]
C:\projects\json\test\src\unit-regression.cpp(1704): message : see reference to function template instantiation 'const std::basic_string<char,std::char_traits<char>,std::allocator<char>> &nlohmann::detail::iteration_proxy_value<nlohmann::detail::iter_impl<nlohmann::basic_json<std::map,std::vector,std::string,bool,int64_t,uint64_t,double,std::allocator,nlohmann::adl_serializer>>>::key(void) const' being compiled [C:\projects\json\test\test-regression.vcxproj]
C:\projects\json\test\src\unit-regression.cpp(1699): message : see reference to class template instantiation 'nlohmann::detail::iteration_proxy_value<nlohmann::detail::iter_impl<nlohmann::basic_json<std::map,std::vector,std::string,bool,int64_t,uint64_t,double,std::allocator,nlohmann::adl_serializer>>>' being compiled [C:\projects\json\test\test-regression.vcxproj].
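
For context, C4267 is the MSVC warning for a `size_t` value being converted to a smaller type such as `int`. The code at json.hpp line 3258 is not shown in this thread, so the following is only a generic sketch of a checked conversion that avoids silent truncation; `checked_size_to_int` is a hypothetical helper and not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Convert a size_t to int, throwing instead of silently truncating.
int checked_size_to_int(std::size_t value)
{
    if (value > static_cast<std::size_t>(std::numeric_limits<int>::max()))
    {
        throw std::out_of_range("size_t value does not fit into int");
    }
    const int result = static_cast<int>(value);
    assert(result >= 0); // guaranteed by the range check above
    return result;
}
```

An explicit `static_cast` alone would also silence the warning, but only a range check like this one rules out the data loss the warning is about.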
