Issue with hard-coded root build-dir
(Submitted by email by @Sawatzky)
I hit a 'fail-to-run-from-clean-build' bug with the current hallc_replay. The cause is that ROOT does not create the necessary child directories under the scratch area set in the rootlogon.C file here:
gSystem->SetBuildDir("$HOME/.root_build_dir");
It creates this set of nested dirs:
/home/brads/.root_build_dir/u/group/c-polhe3/Users/brads/hallc_replay/
but fails to create the './SCRIPTS/src/' under that path, so the subsequent compilation fails. (See [*] at bottom for copy/paste of error output.)
I can work around this by either:
1. Commenting out the "gSystem->SetBuildDir("$HOME/.root_build_dir");" line in rootlogon.C, or
2. Forcing a 'flat' scratch space by adding a kTRUE arg: "gSystem->SetBuildDir("$HOME/.root_build_dir", kTRUE);"
In the case of (1), the *.so file(s) are created under the existing hallc_replay/ structure and all is well. In (2), the *.so files are all created directly under "$HOME/.root_build_dir/" with no 'extra' subdirectories.
FWIW, it looks like the rootlogon.C was added to the repo on Oct 4 2019 (commit a10b2269), so my guess is nobody tried this on a 'clean' hallc_replay repo (and the cached *.so under the standard path already existed, so ROOT didn't try a rebuild of the .so...)
Anyway, I think there could be a couple of potential issues using either of the above schemes on the Farm, since both cases introduce a race condition when multiple farm replays fire off at once and try to build the same *.so files in a shared directory. Ideally the .root_build_dir scratch space should probably be in a node-local tmpdir under /tmp for farm jobs.
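A minimal sketch of the node-local idea, assuming POSIX mkdtemp() is available on the farm nodes. The helper name make_private_build_dir and the "root_build_" prefix are made up for illustration and are not part of hallc_replay or ROOT. It is plain C++ with no ROOT dependency, and only shows how each job could get its own directory so that concurrent jobs never write to the same place.

```cpp
#include <cassert>
#include <stdexcept>
#include <stdlib.h>   // mkdtemp (POSIX)
#include <string>
#include <unistd.h>   // some platforms declare mkdtemp here instead
#include <vector>

// Hypothetical helper (not part of hallc_replay): create a unique, private
// directory under `base` (e.g. "/tmp") so that concurrent jobs on one node
// never share an ACLiC build directory.
std::string make_private_build_dir(const std::string &base) {
  assert(!base.empty());
  // mkdtemp rewrites the trailing XXXXXX in place, so it needs a writable buffer.
  const std::string tmpl = base + "/root_build_XXXXXX";
  std::vector<char> buf(tmpl.begin(), tmpl.end());
  buf.push_back('\0');
  if (mkdtemp(buf.data()) == nullptr) {
    throw std::runtime_error("mkdtemp failed for template " + tmpl);
  }
  return std::string(buf.data());
}
```

The returned path could then be handed to gSystem->SetBuildDir(path.c_str(), kTRUE), or the job script could create the directory itself and pass it in through the environment.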
I'm thinking something like this in 'hallc_replay/rootlogon.C':
// Use $ROOT_BUILD_TMPDIR if it is set, otherwise fall back to the default.
const char *root_build_dir = gSystem->Getenv("ROOT_BUILD_TMPDIR");
if (root_build_dir == NULL) {
  root_build_dir = "$HOME/.root_build_dir";
}
// Setting isflat=kTRUE since ROOT does not create necessary subdirectories
// under some conditions.
gSystem->SetBuildDir(root_build_dir, kTRUE);
// root_build_dir is never NULL here, so it is safe to print with %s.
printf("---> Setting ACLiC build directory to: %s\n\n", root_build_dir);
That will let me export $ROOT_BUILD_TMPDIR in the SWIF script (and clean the tempdir at the end of run).
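For the cleanup step, a shell-based SWIF script would most likely just run rm -rf on the directory. If the cleanup were instead done from a compiled wrapper, a rough sketch could look like the following. It assumes POSIX nftw() is available, and the helper name remove_build_dir is hypothetical.

```cpp
#include <cassert>
#include <ftw.h>       // nftw (POSIX)
#include <stdio.h>     // remove
#include <string>
#include <sys/stat.h>  // struct stat

// nftw callback: delete one entry. remove() handles both regular files and
// directories that have already been emptied.
static int remove_entry(const char *path, const struct stat *, int, struct FTW *) {
  return remove(path);
}

// Hypothetical helper: recursively delete a per-job build directory.
// Returns true on success.
bool remove_build_dir(const std::string &dir) {
  // Guard against an unset variable turning into "" or "/".
  assert(!dir.empty() && dir != "/");
  // FTW_DEPTH visits a directory's contents before the directory itself;
  // FTW_PHYS avoids following symbolic links out of the tree.
  return nftw(dir.c_str(), remove_entry, 16, FTW_DEPTH | FTW_PHYS) == 0;
}
```

Calling this only on a directory that the job itself created keeps one job from deleting another job's build products.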
What do you think?
-- Brad
[*] Example failure copy/paste:
brads@ifarm1802 [git:test_run1*] 1050% ./SCRIPTS/hcreplay -r 3994 -n 1000 -m all HMS
Running HMS replay (mode: all) for run 3994
Event range to analyze: 0 - 1000
Loading definitions and cuts from, DEF-files/definitions.json
Executing: hcana -b -q 'SCRIPTS/src/replay_hms.cxx+(3994,1000,0,"all","DEF-files/HMS/PRODUCTION/hstackana_production_all.def","DEF-files/HMS/PRODUCTION/CUTS/hstackana_production_cuts.def")'
DB_DIR set to DBASE
[ . . . ]
Processing SCRIPTS/src/replay_hms.cxx+(3994,1000,0,"all","DEF-files/HMS/PRODUCTION/hstackana_production_all.def","DEF-files/HMS/PRODUCTION/CUTS/hstackana_production_cuts.def")...
Info in <TUnixSystem::ACLiC>: creating shared library /home/brads/.root_build_dir//u/group/c-polhe3/Users/brads/hallc_replay/./SCRIPTS/src/replay_hms_cxx.so
/bin/ld: cannot open output file /home/brads/.root_build_dir//u/group/c-polhe3/Users/brads/hallc_replay/./SCRIPTS/src/replay_hms_cxx.so: No such file or directory
collect2: error: ld returned 1 exit status
Error in <ACLiC>: Compilation failed!