MCX ""MCX ERROR(-2):can not load the specified config file" error #111

Open
m-planck opened this issue Sep 29, 2020 · 9 comments


m-planck commented Sep 29, 2020

After setting up a plink connection to a remote SSH server running MCX Studio under Windows 10 2004 x64, a test job terminates with the following error:

""-- Executing Simulation --"
MCX ERROR(-2):can not load the specified config file in unit mcx_utils.c:1074
"-- Task completed --"

Input to reproduce, the quicktest job:

"-- Command: --"
mcx --session quicktest --input "'{ ""Session"" : { ""Photons"" : 1.0000000000000000E+007, ""RNGSeed"" : 29012392, ""ID"" : ""quicktest"", ""DoMismatch"" : 0, ""DoNormalize"" : 1, ""DoPartialPath"" : 1, ""DoSaveSeed"" : 0, ""DoSaveRef"" : 0, ""OutputType"" : ""X"" }, ""Domain"" : { ""OriginType"" : 1, ""LengthUnit"" : 1.0000000000000000E+000, ""Media"" : [{ ""mua"" : 0.0000000000000000E+000, ""mus"" : 0.0000000000000000E+000, ""g"" : 1.0000000000000000E+000, ""n"" : 1.0000000000000000E+000 }, { ""mua"" : 5.0000000000000001E-003, ""mus"" : 1.0000000000000000E+000, ""g"" : 1.0000000000000000E-002, ""n"" : 1.0000000000000000E+000 }], ""Dim"" : [60, 60, 60] }, ""Optode"" : { ""Detector"" : [{ ""Pos"" : [2.9000000000000000E+001, 1.9000000000000000E+001, 0.0000000000000000E+000], ""R"" : 1.0000000000000000E+000 }, { ""Pos"" : [2.9000000000000000E+001, 3.9000000000000000E+001, 0.0000000000000000E+000], ""R"" : 1.0000000000000000E+000 }, { ""Pos"" : [1.9000000000000000E+001, 2.9000000000000000E+001, 0.0000000000000000E+000], ""R"" : 1.0000000000000000E+000 }, { ""Pos"" : [3.9000000000000000E+001, 2.9000000000000000E+001, 0.0000000000000000E+000], ""R"" : 1.0000000000000000E+000 }], ""Source"" : { ""Pos"" : [29, 29, 0], ""Dir"" : [0, 0, 1], ""Type"" : ""pencil"" } }, ""Forward"" : { ""T0"" : 0.0000000000000000E+000, ""T1"" : 5.0000000000000001E-009, ""Dt"" : 5.0000000000000001E-009 }, ""Shapes"" : [{ ""Grid"" : { ""Tag"" : 1, ""Size"" : [60, 60, 60] } }, { ""Name"" : ""cube60"" }] }'" --root MCXOutput/mcxsessions/quicktest --outputformat nii --gpu 10000000 --autopilot 1 --photon 10000000 --normalize 1 --save2pt 1 --reflect 0 --savedet 1 --unitinmm 1.00 --seed 29012392 --saveseed 0 --specular 0 --skipradius -2 --array 0 --dumpmask 0 --repeat 1 --savedetflag DP --maxdetphoton 10000000 --bc aaaaaa
Remote Command: plink -pw %PASSWORD% [redacted]
EXEPATH=C:\Program Files\MCXStudio\plink.exe
EXEPATH=C:\Program Files\MCXStudio\MCXSuite\mcx\bin\mcx.exe

The problem seems to be incorrect parsing of the --input argument; running the same command in a local shell leads to the same error. If the "' is replaced with """, the job runs without error from a shell:

> mcx --session quicktest --input """{ \""Session\"" : { \""Photons\"" : 1.0000000000000000E+007, \""RNGSeed\"" : 29012392, \""ID\"" : \""quicktest\"", \""DoMismatch\"" : 0, \""DoNormalize\"" : 1, \""DoPartialPath\"" : 1, \""DoSaveSeed\"" : 0, \""DoSaveRef\"" : 0, \""OutputType\"" : \""X\"" }, \""Domain\"" : { \""OriginType\"" : 1, \""LengthUnit\"" : 1.0000000000000000E+000, \""Media\"" : [{ \""mua\"" : 0.0000000000000000E+000, \""mus\"" : 0.0000000000000000E+000, \""g\"" : 1.0000000000000000E+000, \""n\"" : 1.0000000000000000E+000 }, { \""mua\"" : 5.0000000000000001E-003, \""mus\"" : 1.0000000000000000E+000, \""g\"" : 1.0000000000000000E-002, \""n\"" : 1.0000000000000000E+000 }], \""Dim\"" : [60, 60, 60] }, \""Optode\"" : { \""Detector\"" : [{ \""Pos\"" : [2.9000000000000000E+001, 1.9000000000000000E+001, 0.0000000000000000E+000], \""R\"" : 1.0000000000000000E+000 }, { \""Pos\"" : [2.9000000000000000E+001, 3.9000000000000000E+001, 0.0000000000000000E+000], \""R\"" : 1.0000000000000000E+000 }, { \""Pos\"" : [1.9000000000000000E+001, 2.9000000000000000E+001, 0.0000000000000000E+000], \""R\"" : 1.0000000000000000E+000 }, { \""Pos\"" : [3.9000000000000000E+001, 2.9000000000000000E+001, 0.0000000000000000E+000], \""R\"" : 1.0000000000000000E+000 }], \""Source\"" : { \""Pos\"" : [29, 29, 0], \""Dir\"" : [0, 0, 1], \""Type\"" : \""pencil\"" } }, \""Forward\"" : { \""T0\"" : 0.0000000000000000E+000, \""T1\"" : 5.0000000000000001E-009, \""Dt\"" : 5.0000000000000001E-009 }, \""Shapes\"" : [{ \""Grid\"" : { \""Tag\"" : 1, \""Size\"" : [60, 60, 60] } }, { \""Name\"" : \""cube60\"" }] }""" --root MCXOutput/mcxsessions/quicktest --outputformat nii --gpu 10000000 --autopilot 1 --photon 10000000 --normalize 1 --save2pt 1 --reflect 0 --savedet 1 --unitinmm 1.00 --seed 29012392 --saveseed 0 --specular 0 --skipradius -2 --array 0 --dumpmask 0 --repeat 1 --savedetflag DP --maxdetphoton 10000000 --bc aaaaaa
> ###############################################################################
> #                      Monte Carlo eXtreme (MCX) -- CUDA                      #
> #          Copyright (c) 2009-2020 Qianqian Fang <q.fang at neu.edu>          #
> #                             http://mcx.space/                               #
> #                                                                             #
> # Computational Optics & Translational Imaging (COTI) Lab- http://fanglab.org #
> #   Department of Bioengineering, Northeastern University, Boston, MA, USA    #
> ###############################################################################
> #    The MCX Project is funded by the NIH/NIGMS under grant R01-GM114365      #
> ###############################################################################
> $Rev::0a8ea8$ v2020 $Date::2020-09-16 19:55:53 -07$ by $Author::Qianqian Fang $
> ###############################################################################
> - variant name: [Fermi] compiled by nvcc [7.5] with CUDA [7050]
> - compiled with: RNG [xorshift128+] with Seed Length [4]
> 
> GPU=1 (GeForce GTX 1080 Ti) threadph=174 extra=22144 np=10000000 nthread=57344 maxgate=1 repetition=1
> initializing streams ...        init complete : 0 ms
> requesting 1280 bytes of shared memory
> launching MCX simulation for time window [0.00e+000ns 5.00e+000ns] ...
> simulation run# 1 ...
> kernel complete:        286 ms
> retrieving fields ...   detected 30014 photons, total: 30014    transfer complete:      306 ms
> normalizing raw data ...        source 1, normalization factor alpha=20.000000
> data normalization complete : 310 ms
> saving data to file ... saving data complete : 312 ms
> 
> simulated 10000000 photons (10000000) with 57344 threads (repeat x1)
> MCX simulation speed: 36231.88 photon/ms
> total simulated energy: 10000000.00     absorbed: 17.69319%
> (loss due to initial specular reflection is excluded in the total)

fangq commented Sep 29, 2020

just did a test, it seems to work ok.

my server is an Ubuntu 16.04 box with 2x TitanV GPU. The PATH variable is set in ~/.bashrc to point to where the mcx executable is located.

The host is running Windows 10 with a different GPU (1050Ti). When I chose the plink command, enabled Run remote command, and clicked on the "GPU" button, it initially asked whether to store the server key in the cache. I typed 'y' in the User input and clicked Send, and it hung the first time. But after restarting mcxstudio, it was able to retrieve the GPU info on the server correctly. So, it looks like the server ssh key was cached.

Then I ran a simulation, it also worked ok, see my screenshot below.

mcxstudio_remote


fangq commented Sep 29, 2020

Looking at the source code regarding the json data after --input, I do have this line to replace " by \" on Windows

https://github.com/fangq/mcx/blob/master/mcxstudio/mcxgui.pas#L3014-L3016

not sure why it did not get replaced in your case - maybe this is locale related?


m-planck commented Sep 30, 2020

Sorry, I think the escape characters were lost somewhere between copy and paste and the forum formatting.

mcx --session remotetest --input "'{ \""Session\"" : { \""Photons\"" : 1.0000000000000000E+006, \""RNGSeed\"" : 1648335518, \""ID\"" : \""remotetest\"", \""DoMismatch\"" : 1, \""DoNormalize\"" : 1, \""DoPartialPath\"" : 1, \""DoSaveSeed\"" : 0, \""DoSaveRef\"" : 0, \""OutputType\"" : \""X\"" }, \""Domain\"" : { \""OriginType\"" : 1, \""LengthUnit\"" : 1.0000000000000000E+000, \""Media\"" : [{ \""mua\"" : 0.0000000000000000E+000, \""mus\"" : 0.0000000000000000E+000, \""g\"" : 1.0000000000000000E+000, \""n\"" : 1.0000000000000000E+000 }, { \""mua\"" : 5.0000000000000001E-003, \""mus\"" : 1.0000000000000000E+000, \""g\"" : 1.0000000000000000E-002, \""n\"" : 1.3700000000000001E+000 }], \""Dim\"" : [60, 60, 60], \""MediaFormat\"" : \""byte\"" }, \""Optode\"" : { \""Detector\"" : [{ \""Pos\"" : [2.4000000000000000E+001, 2.9000000000000000E+001, 0.0000000000000000E+000], \""R\"" : 1.0000000000000000E+000 }], \""Source\"" : { \""Pos\"" : [29, 29, 0], \""Dir\"" : [0, 0, 1], \""Type\"" : \""pencil\"" } }, \""Forward\"" : { \""T0\"" : 0.0000000000000000E+000, \""T1\"" : 5.0000000000000001E-009, \""Dt\"" : 5.0000000000000001E-009 }, \""Shapes\"" : [{ \""Grid\"" : { \""Tag\"" : 1, \""Size\"" : [60, 60, 60] } }] }'" --root MCXOutput/mcxsessions/remotetest --outputformat nii --gpu 1 --autopilot 1 --photon 1000000 --normalize 1 --save2pt 1 --reflect 1 --savedet 1 --unitinmm 1.00 --seed 1648335518 --saveseed 0 --specular 0 --skipradius -2 --array 0 --dumpmask 0 --repeat 1 --savedetflag DP --maxdetphoton 10000000

I do not know much Pascal. Maybe you could explain the program flow. So, if you create a remote job in MCX Studio, is a json file written first? Is the file then passed to a function that inserts escape characters so it can be sent over ssh?

So I think the parser of the json is working as intended, but the --input string is not concatenated correctly. A single quote is used here. See line 3021: it is not easy to spot, but these are two single quotes, not a double quote:
param.Add(''''+Trim(inputjson)+'''')

I tested with MS Windows PowerShell, MS CMD, MS Terminal and a plink shell:
The --input section should be enclosed in only one pair of double quotes:
mcx --session remotetest --input "{ \""Session\"" :....................
With that it worked in all the shells mentioned. It still needs more testing, e.g. whether there is a length limit on a parameter that can be passed within a shell.
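
On the open question of a length limit: cmd.exe is documented to accept at most 8191 characters per command line, and the Windows CreateProcess API at most 32767 including the terminating null. Below is a minimal Python sketch for checking a generated command against those two limits before sending it. The function name `fits_limits` and the shortened example command are made up for illustration and are not part of MCX Studio; limits imposed by plink or the SSH server are not covered.

```python
CMD_EXE_LIMIT = 8191          # longest command line cmd.exe accepts
CREATEPROCESS_LIMIT = 32767   # CreateProcess limit, counting the trailing null


def fits_limits(cmdline: str) -> dict:
    """Report whether a command line fits the two documented Windows limits."""
    n = len(cmdline)
    return {
        "length": n,
        "cmd.exe": n <= CMD_EXE_LIMIT,
        "CreateProcess": n + 1 <= CREATEPROCESS_LIMIT,
    }


# Hypothetical, shortened stand-in for a command generated by the GUI.
example = 'mcx.exe --session remotetest --input "{ \\"Session\\" : 1 }"'
print(fits_limits(example))
```

The quicktest command in this thread is roughly two thousand characters, so it should sit well below both limits even after a second round of escaping; larger configurations with many detectors or shapes may not.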

@m-planck

PS: I also do not understand https://github.com/fangq/mcx/blob/master/mcxstudio/mcxgui.pas#L3013
inputjson:=StringReplace(SaveJSONConfig(jsonfile),'"','"',[rfReplaceAll]);
For me it replaces nothing?
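
Read literally, the call as it is displayed here replaces a double quote with itself. One possible explanation, which is only a guess, is that the page rendering swallowed a backslash and the source actually replaces `"` with `\"`, as described earlier in the thread. A short Python sketch of the difference between the two readings (this is not the Pascal code, and `escape_quotes_for_windows` is a hypothetical name):

```python
def escape_quotes_for_windows(json_text: str) -> str:
    """Prefix every double quote with a backslash (the assumed intent)."""
    return json_text.replace('"', '\\"')


sample = '{ "Session" : 1 }'

# The replacement exactly as it is rendered on this page: a no-op.
literal = sample.replace('"', '"')
print(literal == sample)

# The replacement with the backslash that may have been lost in rendering.
print(escape_quotes_for_windows(sample))
```

Checking the raw source file rather than the rendered issue page would settle which reading is correct.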


fangq commented Oct 4, 2020

also, I am not sure if this difference is a result of windows version differences. The lab server I tested is running on Windows 10 Enterprise, Version 1709. We do not avidly update this machine because we want to keep it stable. what is your windows version?


m-planck commented Oct 6, 2020

@fangq Your screenshot has some sensitive information, please check


m-planck commented Oct 6, 2020

@fangq
I tested on
Client MS Windows 10 2004 (20H1) x64 DE and Windows 7 x64 US, plink 0.74 x64
Server MS Windows 10 2004 (20H1) x64 running built in SSH server OpenSSH_for_Windows_7.7p1, LibreSSL 2.6.5 DE and US

Working command line for Windows 10 and 7 and local execution in a cmd shell

mcx.exe --session remotetest --input "{ \"Session\" : { \"Photons\" : 1.0000000000000000E+006, \"RNGSeed\" : 1648335518, \"ID\" : \"remotetest\", \"DoMismatch\" : 1, \"DoNormalize\" : 1, \"DoPartialPath\" : 1, \"DoSaveSeed\" : 0, \"DoSaveRef\" : 0, \"OutputType\" : \"X\" }, \"Domain\" : { \"OriginType\" : 1, \"LengthUnit\" : 1.0000000000000000E+000, \"Media\" : [{ \"mua\" : 0.0000000000000000E+000, \"mus\" : 0.0000000000000000E+000, \"g\" : 1.0000000000000000E+000, \"n\" : 1.0000000000000000E+000 }, { \"mua\" : 5.0000000000000001E-003, \"mus\" : 1.0000000000000000E+000, \"g\" : 1.0000000000000000E-002, \"n\" : 1.3700000000000001E+000 }], \"Dim\" : [60, 60, 60], \"MediaFormat\" : \"byte\" }, \"Optode\" : { \"Detector\" : [{ \"Pos\" : [2.4000000000000000E+001, 2.9000000000000000E+001, 0.0000000000000000E+000], \"R\" : 1.0000000000000000E+000 }], \"Source\" : { \"Pos\" : [29, 29, 0], \"Dir\" : [0, 0, 1], \"Type\" : \"pencil\" } }, \"Forward\" : { \"T0\" : 0.0000000000000000E+000, \"T1\" : 5.0000000000000001E-009, \"Dt\" : 5.0000000000000001E-009 }, \"Shapes\" : [{ \"Grid\" : { \"Tag\" : 1, \"Size\" : [60, 60, 60] } }] }" --root C:\ProgramData\MCXStudio\MCXOutput\mcxsessions\remotetest --outputformat nii --gpu 1 --autopilot 1 --photon 1000000 --normalize 1 --save2pt 1 --reflect 1 --savedet 1 --unitinmm 1.00 --seed 1648335518 --saveseed 0 --specular 0 --skipradius -2 --array 0 --dumpmask 0 --repeat 1 --savedetflag DP --maxdetphoton 10000000

Note: there is no single quote ' around the --input parameter, and every double quote " inside the JSON is preceded by a backslash. The Windows build of mcx.exe expects \"
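
The quoting in the working command above follows the rule the Microsoft C runtime uses when it splits a command line into arguments. Python's `subprocess.list2cmdline` applies that same rule, so it can serve as a quick sketch of what the `--input` argument should look like for a Windows `mcx.exe`. The shortened config below is a made-up stand-in for the full JSON, and nothing here is taken from the MCX Studio source:

```python
import json
import subprocess

# Hypothetical, shortened config; the real one has many more fields.
config = {"Session": {"ID": "remotetest", "Photons": 1000000}}
json_text = json.dumps(config)

# list2cmdline wraps arguments containing spaces in one pair of double
# quotes and writes every embedded double quote as \" .
cmdline = subprocess.list2cmdline(["mcx.exe", "--input", json_text])
print(cmdline)
```

The result contains no single quotes, which matches the observation that the single-quote wrapper only makes sense when a POSIX shell sits between the client and mcx.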

Using plink, the above command can be entered as-is in an interactive Windows 10 shell. If the plink command is given as a one-liner, the backslashes and double quotes must additionally be escaped for the local shell, so every \ becomes \\ and every " becomes \"
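
The extra escaping layer for the one-liner can be sketched as a simple two-step string replacement. This is an illustration in Python with a hypothetical helper name, not code from MCX Studio; the order matters, because escaping the quotes first would cause the newly added backslashes to be doubled in the second step.

```python
def escape_for_outer_shell(cmdline: str) -> str:
    """Escape backslashes first, then double quotes, for one more shell layer."""
    return cmdline.replace("\\", "\\\\").replace('"', '\\"')


# Inner form: what the remote cmd shell should finally receive.
inner = r'--input "{ \"Session\" : 1 }"'

# Outer form: what has to be typed after "plink ... mcx.exe" on the client.
outer = escape_for_outer_shell(inner)
print(outer)
```

Each additional shell between the client and mcx.exe would need one more application of the same function.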

This is working for me
plink -no-antispoof -pw xx -ssh user@xx -P xx mcx.exe --session remotetest --input \"{ \\\"Session\\\" : { \\\"Photons\\\" : 1.0000000000000000E+006, \\\"RNGSeed\\\" : 1648335518, \\\"ID\\\" : \\\"remotetest\\\", \\\"DoMismatch\\\" : 1, \\\"DoNormalize\\\" : 1, \\\"DoPartialPath\\\" : 1, \\\"DoSaveSeed\\\" : 0, \\\"DoSaveRef\\\" : 0, \\\"OutputType\\\" : \\\"X\\\" }, \\\"Domain\\\" : { \\\"OriginType\\\" : 1, \\\"LengthUnit\\\" : 1.0000000000000000E+000, \\\"Media\\\" : [{ \\\"mua\\\" : 0.0000000000000000E+000, \\\"mus\\\" : 0.0000000000000000E+000, \\\"g\\\" : 1.0000000000000000E+000, \\\"n\\\" : 1.0000000000000000E+000 }, { \\\"mua\\\" : 5.0000000000000001E-003, \\\"mus\\\" : 1.0000000000000000E+000, \\\"g\\\" : 1.0000000000000000E-002, \\\"n\\\" : 1.3700000000000001E+000 }], \\\"Dim\\\" : [60, 60, 60], \\\"MediaFormat\\\" : \\\"byte\\\" }, \\\"Optode\\\" : { \\\"Detector\\\" : [{ \\\"Pos\\\" : [2.4000000000000000E+001, 2.9000000000000000E+001, 0.0000000000000000E+000], \\\"R\\\" : 1.0000000000000000E+000 }], \\\"Source\\\" : { \\\"Pos\\\" : [29, 29, 0], \\\"Dir\\\" : [0, 0, 1], \\\"Type\\\" : \\\"pencil\\\" } }, \\\"Forward\\\" : { \\\"T0\\\" : 0.0000000000000000E+000, \\\"T1\\\" : 5.0000000000000001E-009, \\\"Dt\\\" : 5.0000000000000001E-009 }, \\\"Shapes\\\" : [{ \\\"Grid\\\" : { \\\"Tag\\\" : 1, \\\"Size\\\" : [60, 60, 60] } }] }\" --root C:\ProgramData\MCXStudio\MCXOutput\mcxsessions\remotetest --outputformat nii --gpu 1 --autopilot 1 --photon 1000000 --normalize 1 --save2pt 1 --reflect 1 --savedet 1 --unitinmm 1.00 --seed 1648335518 --saveseed 0 --specular 0 --skipradius -2 --array 0 --dumpmask 0 --repeat 1 --savedetflag DP --maxdetphoton 10000000

[image]

plink starts a cmd shell on the remote machine and passes the arguments of the command, removing the leading backslash. This also works from a Win7 cmd shell with an English Windows version, so I am pretty sure it is not related to the region setting of the client.

On the remote Windows 10 PC I used Process Explorer to log which command-line parameters are actually passed to the executable. You can see them in the screenshot:

[screenshot: Process Explorer showing the command line passed to mcx.exe]

PS1: For a local job, the batch file uses the local .json file with --input file.json without any issue
PS2: In the video 5_How_to_run_mcx_on_remote_GPUs.swf the --input starts differently again, with '{ \"Session\" :



fangq commented Oct 7, 2020

> @fangq Your screenshot has some sensitive information, please check

oops. the screenshot is deleted and password changed, thanks. did not check carefully.

> Client MS Windows 10 2004 (20H1) x64 DE and Windows 7 x64 US, plink 0.74 x64
> Server MS Windows 10 2004 (20H1) x64 running built in SSH server OpenSSH_for_Windows_7.7p1, LibreSSL 2.6.5 DE and US

ok, I see the problem. You are using a Windows server. I admit that I have never tried that. All my tests so far have used a Linux server to run the remote commands. I think the handling of the quotation mark escape will differ depending on your ssh server (and its default shell).

On my windows machine, I have an ssh server installed via OpenSSH-Win64

https://github.com/PowerShell/Win32-OpenSSH/releases

Screenshot 2020-10-07 17 32 46

when running the below command from a Linux terminal, I have

fangq@taote:~$ ssh fangq@winserver echo "$SHELL"
/bin/bash

fangq@taote:~$ ssh fangq@winserver uname -a
MSYS_NT-10.0 server 2.8.0(0.310/5/3) 2017-04-02 13:38 x86_64 Msys

if I use the escape string printed from mcxstudio, I get the correct json string, and the Windows server behaves the same as a linux server

fangq@taote:~$ ssh fangq@winserver echo "'{ \""Session\"":1 }'"
{ "Session":1 }

fangq@taote:~$ ssh fangq@linuxserver echo "'{ \""Session\"":1 }'"
{ "Session":1 }

can you try this on your win64-ssh server and see what you get? my suspicion is that your windows ssh server does not use bash as the default shell, but PS or cmd, which can not parse "" properly
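
The two echo results above can be reproduced step by step when both the local and the remote shell follow POSIX quoting rules. The sketch below uses Python's `shlex.split`, which only approximates a POSIX shell and says nothing about how cmd or PowerShell would treat the same string; the variable names are illustrative.

```python
import shlex

# The command as typed in a local bash, taken from the echo test above.
typed = r'''echo "'{ \""Session\"":1 }'"'''

# Stage 1: the local shell strips the outer double quotes and the backslashes.
local_argv = shlex.split(typed)
print(local_argv)

# Stage 2: ssh joins the arguments with spaces and the remote shell parses
# the line again, this time stripping the single quotes.
remote_line = " ".join(local_argv)
remote_argv = shlex.split(remote_line)
print(remote_argv[1])
```

The single quotes exist only to protect the JSON during the second parse. If the remote shell does not treat single quotes as quoting characters, which is the suspicion for cmd, that protection is lost.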

https://stackoverflow.com/questions/7760545/escape-double-quotes-in-parameter

I think the escape format I used is related to this SO thread

https://stackoverflow.com/a/31413730/4271392


m-planck commented Oct 8, 2020

@fangq
OpenSSH_for_Windows_7.7p1 is the current version of the sshd included with Windows 10 2004; it can be added from the Windows component store.
[image]
It is actively developed and maintained by MS, so security fixes arrive through Windows Update.

In a default installation it uses the cmd shell; changing that requires manual configuration:
https://docs.microsoft.com/en-us/windows-server/administration/openssh/openssh_server_configuration

You installed OpenSSH manually; what version does ssh -V report?
You must also have manually changed the shell that OpenSSH launches.
For the ported bash, did you install MSYS2 or Mingw32, and probably not Mingw64 or Cygwin? The bash was definitely installed separately on your Windows machine, and you modified OpenSSH to point to that specific bash.

As Git for Windows is installed on the remote machine, I tried with the ported git bash. In PowerShell, a registry value is added to point to that bash:
New-ItemProperty -Path "HKLM:\SOFTWARE\OpenSSH" -Name DefaultShell -Value "C:\Program Files\Git\bin\bash.exe" -PropertyType String -Force

Now plink opens a bash shell via OpenSSH, and bash calls mcx. But with this configuration there is even more filtering: single quotes are now stripped as well, along with more double quotes. Launched from MCXStudio, the actual command-line parameters on the Windows 10 remote server are
C:\Program Files\MCXStudio\MCXSuite\mcx\bin\mcx.exe" --session oct2 --input "{ Session : { Photons :
So the situation is worse.
The git bash reports:

$ echo "$SHELL"
/usr/bin/bash
$ uname -a
MINGW64_NT-10.0 xxx 2.8.0(0.310/5/3) 2017-04-02 13:38 x86_64 Msys

$ echo "'{ ""Session"":1 }'"
'{ "Session":1 }'

and from the client in a cmd shell:

plink -no-antispoof -pw xx -ssh xx echo "'{ ""Session"":1 }'"

{ Session:1 }

and from the client in a git bash shell:

plink -no-antispoof -pw xx -ssh xx echo "'{ ""Session"":1 }'"

{ Session:1 }
ssh xx echo "'{ ""Session"":1 }'"
{ Session:1 }

Since Windows 10 now includes a Linux subsystem, I could try WSL2, but I do not expect any difference. Quotes and backslashes are filtered and need to be escaped for every shell involved.
