From owner-chemistry@ccl.net Sun Jun 22 06:21:01 2008
From: "K.Radacki K.Radacki(~)mail.uni-wuerzburg.de"
To: CCL
Subject: CCL:G: g03: problem with link1
Message-Id: <-37227-080622061523-7538-Iwn/6U+fkBBCqp75C0BFnw*server.ccl.net>
Date: Sun, 22 Jun 2008 12:15:01 +0200

Sent to CCL by: "K.Radacki" [K.Radacki|*|mail.uni-wuerzburg.de]
Several people have already written to me with questions about user/group/world permissions.

The g03root is:
lrwxrwxrwx   1 root  root     19 Jun 21 12:43 g03 -> g03.d1-acml40_64Int
drwxr-x---   2 root  qchem 12288 Jun 21 19:10 g03.d1-acml40_64Int
and all executables are
"-rwxr-x---   1 root qchem"
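For anyone reading along, here is a minimal Python sketch of how the mode strings above are read. The function name `can_execute` is made up for illustration. It assumes plain `ls -l` style strings and ignores ACLs and the fact that root bypasses these checks.

```python
def can_execute(mode, is_owner, in_group):
    """Return True if an ls-style mode string grants execute permission
    (or, for a directory, permission to enter it) to the given caller."""
    perms = mode[1:10]              # drop the leading file-type character
    if is_owner:
        return perms[2] in "xs"     # owner execute bit (s = setuid + x)
    if in_group:
        return perms[5] in "xs"     # group execute bit (s = setgid + x)
    return perms[8] in "xt"         # world execute bit (t = sticky + x)


# A member of qchem who is not the owner (root), on the directory and a binary:
print(can_execute("drwxr-x---", is_owner=False, in_group=True))
print(can_execute("-rwxr-x---", is_owner=False, in_group=True))
# Someone outside the group:
print(can_execute("-rwxr-x---", is_owner=False, in_group=False))
```

With these modes, members of qchem can enter the directory and run the binaries, and everyone else is locked out.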

I fooled around a bit more and found that the operating system itself is probably responsible for my trouble.
I took the whole g03 directory from the machine where it was compiled (CentOS 4.3) and moved it to a different one (SUSE 9).
After copying PGI/.../lib and setting LD_LIBRARY_PATH, the calculations finished successfully.
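A quick way to see whether the PGI runtime libraries are being found is to look at what `ldd` reports for the binary. The sketch below only parses `ldd` output that has already been captured as text. The helper name `missing_libraries` and the sample output (library names, paths and addresses) are invented for illustration and are not taken from the machines discussed here.

```python
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Resolved and unresolved entries both look like "name => target".
        name, sep, target = line.strip().partition(" => ")
        if sep and target.startswith("not found"):
            missing.append(name)
    return missing


# Hypothetical ldd output, for illustration only.
sample = """\
    libpgc.so => not found
    libm.so.6 => /lib64/tls/libm.so.6 (0x0000003a1c400000)
    libc.so.6 => /lib64/tls/libc.so.6 (0x0000003a1c100000)
"""
print(missing_libraries(sample))
```

An empty list after setting LD_LIBRARY_PATH would suggest the dynamic linker can now resolve everything the executable needs.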

In /var/log/messages I found this entry:
... kernel: l1.exe[29525]: segfault at 00000f85f11c4920 rip 000000000048051e rsp 0000007fbfffab80 error 4
I copied the binaries to another CentOS machine and observed the same result.
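The kernel line above can be picked apart mechanically. Below is a small Python sketch that splits such a line into its fields and decodes the x86 page-fault error code (bit 0: page was present, bit 1: write access, bit 2: fault came from user mode). The function name `decode_segfault` is made up, and the sketch assumes the error value is printed in hexadecimal.

```python
import re

SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Parse an x86_64 kernel 'segfault at' log line; return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)   # assumption: error code is logged in hex
    return {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "instruction_pointer": int(m.group("rip"), 16),
        "stack_pointer": int(m.group("rsp"), 16),
        "page_present": bool(err & 1),   # 0: no page mapped at that address
        "write_access": bool(err & 2),   # 0: the access was a read
        "user_mode": bool(err & 4),      # 1: fault happened in user space
    }


line = ("... kernel: l1.exe[29525]: segfault at 00000f85f11c4920 "
        "rip 000000000048051e rsp 0000007fbfffab80 error 4")
for key, value in decode_segfault(line).items():
    print(key, value)
```

Read this way, "error 4" has only the user-mode bit set: a read, from user space, of an address where no page is mapped.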

The CentOS machine is using kernel 2.6.9-34.ELsmp and the SUSE one 2.6.5-7.244-smp (by the way, all three machines have the same hardware).

I would be very grateful for any clues.
kris

From owner-chemistry@ccl.net Sun Jun 22 13:21:01 2008
From: "Vincenzo Verdolino vincenzo.verdolino=nemo.unipr.it"
To: CCL
Subject: CCL:G: g03: problem with link1
Message-Id: <-37228-080622115008-2501-GquqI7/oWw6SJulhkwRSVg###server.ccl.net>
Date: Sun, 22 Jun 2008 17:52:10 +0200

Sent to CCL by: "Vincenzo Verdolino" [vincenzo.verdolino^nemo.unipr.it]
Could you copy and paste the input and output files? Do you have this problem when executing g03 as a user or as root?

--
Universita' degli Studi di Parma (http://www.unipr.it)

From owner-chemistry@ccl.net Sun Jun 22 17:59:01 2008
From: "K.Radacki K.Radacki__mail.uni-wuerzburg.de"
To: CCL
Subject: CCL:G: g03: problem with link1
Message-Id: <-37229-080622174625-20837-61ExEcgEI6Yy2M6+alOZKA=-=server.ccl.net>
Date: Sun, 22 Jun 2008 23:46:01 +0200

Sent to CCL by: "K.Radacki" [K.Radacki-$-mail.uni-wuerzburg.de]
Vincenzo Verdolino vincenzo.verdolino=nemo.unipr.it wrote:

> Could you copy and paste the input and output file? Do you have this problem executing g03 as user or root?

I've tried both as a normal user in group qchem (the group that owns the g03 directory) and as root.

Input:
%MEM=100MB
%NPROC=2
%CHK=BNH2
%SAVE

#P OPT=(Z-Matr,TIGHT) POP=FULL B3LYP/6-31G(D,P)
#  SCF=(MAXCYCLE=150) INT=(GRID=99590)

[=> B-NH2

0 1
X
B   1  c2
N   2  r1   1   c90
H   3  r2   2   a2    1   c90
H   3  r2   2   a2    1  -c90

r1=1.3
r2=.9
a2=120.

c2=2.
c90=90.


Output:
 Entering Gaussian System, Link 0=g03
 Initial command:
 /usr/local/SW/g03/l1.exe /tmp/Gau_8192/Gau-8203.inp -scrdir=/tmp/Gau_8192/

and additionally the last line of  /var/log/messages:
Jun 22 23:44:09 hall14 kernel: l1.exe[8204]: segfault at 00000f85f11c48c0 rip 00000000004804ce rsp 0000007fbfffac80 error 4

The link l1.exe itself runs and generates the following output:
PGFIO-F-211/OPEN/unit=5/invalid file name.
 File name = stdin     formatted, sequential access   record = 0
 In source file l1init.f, at line number 141


Or, if I start l1.exe BNH2.com directly, I see ~normal~ output:
Default is to use a total of   2 processors:
                                2 via shared-memory
                                1 via Linda
 Entering Link 1 = l1.exe PID=      8256.
...
...
 ******************************************
 Gaussian 03:  AM64L-G03RevD.01 13-Oct-2005
                22-Jun-2008
 ******************************************
 %MEM=100MB
 %NPROC=2
 Will use up to    2 processors via shared memory.
 %CHK=BNH2
 %SAVE
Segmentation fault


In the meantime I've updated CentOS to the latest kernel available from the repositories (2.6.9-34 -> 2.6.9-67.0.15) and observed no difference in behavior.
I recompiled Gaussian with standard BLAS in place of ACML, all in vain.

I still have the option of compiling it with IFC, but I have never done that on Opterons. I'm not going to install SUSE.



