[orbit-user] OMF5.2 load image failure in outdoor orbit nodes
Tong Jin
tjin at eden.rutgers.edu
Tue Apr 27 23:31:02 EDT 2010
Hi, Christoph,
Thanks for your reply. The result you see is correct. Jack fixed outdoor
node 102 last weekend, so it works now. I am not sure whether the other nodes
work. For node 102, I tried it yesterday and got the same correct result
as you did.
Thanks.
Tong
On Tue, Apr 27, 2010 at 10:20 PM, Christoph Dwertmann <lists.cd at gmail.com> wrote:
> Hi Tong!
>
> I just tried to reproduce the error message you saw. I SSH'd into
> outdoor.orbit-lab.org and ran:
>
> cdw at console:~$ omf-5.2 load [1,102] tongjin_all.ndz
> Imaging nodes: '[1,102]' with image 'tongjin_all.ndz'
> (Domain: default from hostname)
> (Timeout: 800 sec.)
> INFO NodeHandler: init OMF Experiment Controller 5.2.408
> INFO NodeHandler: init Experiment ID:
> outdoor.orbit-lab.org_2010_04_27_22_11_14
> INFO NodeHandler: Web interface available at: http://10.40.0.10:4000
> INFO Experiment: load system:exp:stdlib
> INFO property.resetDelay: value = 210 (Fixnum)
> INFO property.resetTries: value = 1 (Fixnum)
> INFO Experiment: load system:exp:imageNode
> INFO property.nodes: value = [1, 102] (Array)
> INFO property.image: value = "tongjin_all.ndz" (String)
> INFO property.domain: value = nil (NilClass)
> INFO property.outpath: value = "/tmp" (String)
> INFO property.timeout: value = 800 (Fixnum)
> INFO stdlib: Waiting for nodes (Up/Down/Total): 0/1/1 - (still down:
> n_1_102)
> INFO stdlib: Waiting for nodes (Up/Down/Total): 0/1/1 - (still down:
> n_1_102)
> INFO stdlib: Waiting for nodes (Up/Down/Total): 0/1/1 - (still down:
> n_1_102)
> INFO stdlib: Waiting for nodes (Up/Down/Total): 0/1/1 - (still down:
> n_1_102)
> INFO stdlib: Waiting for nodes (Up/Down/Total): 0/1/1 - (still down:
> n_1_102)
> INFO stdlib: Waiting for nodes (Up/Down/Total): 0/1/1 - (still down:
> n_1_102)
> INFO exp: Progress(0/0/1): 0/0/0 min(n_1_102)/avg/max (110) - Timeout: 690
> sec.
> INFO whenAll: *: 'status[@value='UP']' fires
> INFO exp: Progress(0/0/1): 0/0/0 min(n_1_102)/avg/max (110) - Timeout: 680
> sec.
> INFO exp: Progress(0/0/1): 0/0/0 min(n_1_102)/avg/max (110) - Timeout: 670
> sec.
> INFO exp: Progress(0/0/1): 0/0/0 min(n_1_102)/avg/max (110) - Timeout: 660
> sec.
> INFO exp: Progress(0/0/1): 0/0/0 min(n_1_102)/avg/max (110) - Timeout: 650
> sec.
> INFO exp: Progress(0/0/1): 0/0/0 min(n_1_102)/avg/max (110) - Timeout: 640
> sec.
> INFO exp: Progress(0/0/1): 10/10/10 min(n_1_102)/avg/max (110) -
> Timeout: 630 sec.
> INFO exp: Progress(0/0/1): 20/20/20 min(n_1_102)/avg/max (110) -
> Timeout: 620 sec.
> INFO exp: Progress(0/0/1): 30/30/30 min(n_1_102)/avg/max (110) -
> Timeout: 610 sec.
> INFO exp: Progress(0/0/1): 30/30/30 min(n_1_102)/avg/max (110) -
> Timeout: 600 sec.
> INFO exp: Progress(0/0/1): 40/40/40 min(n_1_102)/avg/max (110) -
> Timeout: 590 sec.
> INFO exp: Progress(0/0/1): 50/50/50 min(n_1_102)/avg/max (110) -
> Timeout: 580 sec.
> INFO exp: Progress(0/0/1): 60/60/60 min(n_1_102)/avg/max (110) -
> Timeout: 570 sec.
> INFO exp: Progress(0/0/1): 60/60/60 min(n_1_102)/avg/max (110) -
> Timeout: 560 sec.
> INFO exp: Progress(0/0/1): 80/80/80 min(n_1_102)/avg/max (110) -
> Timeout: 550 sec.
> INFO exp: Progress(1/0/1): 100/100/100 min()/avg/max (110) - Timeout: 540
> sec.
> INFO exp: -----------------------------
> INFO exp: Imaging Process Done
> INFO exp: - 1 node(s) successfully imaged - See the topology file:
> '/tmp/outdoor.orbit-lab.org_2010_04_27_22_11_14_topo_active.rb'
> INFO exp: -----------------------------
> INFO Experiment: DONE!
> INFO NodeHandler: Shutting down experiment, please wait...
> INFO NodeHandler: Shutdown flag is set - Turning Off the resources
> INFO run: Experiment outdoor.orbit-lab.org_2010_04_27_22_11_14
> finished after 4:22
>
> Is this the command you ran? Can you please give more details about the
> circumstances under which you encountered the error? Can you also please
> open a second SSH session to repository2, run "tail -f
> /var/log/omf-aggmgr-5.2.log" there, and post the messages you see
> while you encounter the ServiceException?
>
> Thank you!
> Kind regards,
>
> Christoph Dwertmann
>
>
> On Fri, Apr 23, 2010 at 11:40 AM, Tong Jin <tjin at eden.rutgers.edu> wrote:
> > Hi,
> > I tried to load images on outdoor orbit nodes using the command "omf-5.2
> > load", but it does not always work. Could anyone check that please?
> > I have put the failure output below and hope it helps.
> > Thanks.
> >
> > Imaging nodes: '[1,102]' with image 'tongjin_all.ndz'
> > (Domain: default from hostname)
> > (Timeout: 800 sec.)
> > INFO NodeHandler: init OMF Experiment Controller 5.2.388
> > INFO NodeHandler: init Experiment ID:
> > outdoor.orbit-lab.org_2010_04_20_16_46_34
> > INFO NodeHandler: Web interface available at: http://10.40.0.10:4000
> > INFO Experiment: load system:exp:stdlib
> > INFO property.resetDelay: value = 210 (Fixnum)
> > INFO property.resetTries: value = 1 (Fixnum)
> > INFO Experiment: load system:exp:imageNode
> > INFO property.nodes: value = [1, 102] (Array)
> > INFO property.image: value = "tongjin_all.ndz" (String)
> > INFO property.domain: value = nil (NilClass)
> > INFO property.outpath: value = "/tmp" (String)
> > INFO property.timeout: value = 800 (Fixnum)
> > FATAL service_call: Exception: ServiceException
> > (http://repository2:5052/pxe/setBootImageNS?domain=outdoor.orbit-lab.org&ns=[[1,102]])
> > INFO NodeHandler: Shutdown flag is set - Turning Off the resources
> > FATAL service_call: Exception: ServiceException
> > (http://repository2:5052/pxe/clearBootImageNS?domain=outdoor.orbit-lab.org&ns=[[1,102]])
> > /usr/share/omf-expctl-5.2/omf-expctl/nodeHandler.rb:278:in `service_call': ServiceException (ServiceException)
> > from /usr/share/omf-expctl-5.2/omf-expctl/node/nodeSet.rb:510:in `setPxeEnvMulti'
> > from /usr/share/omf-expctl-5.2/omf-expctl/node/nodeSet.rb:475:in `pxeImage'
> > from /usr/share/omf-expctl-5.2/omf-expctl/node/rootNodeSetPath.rb:85:in `pxeImage'
> > from /usr/share/omf-expctl-5.2/omf-expctl/nodeHandler.rb:748:in `shutdown'
> > from /usr/share/omf-expctl-5.2/omf-expctl.rb:71
> >
> > Tong
> >
> > _______________________________________________
> > orbit-user mailing list
> > orbit-user at orbit-lab.org
> > http://orbit-lab.org/cgi-bin/mailman/listinfo/orbit-user
> > to unsubscribe login to the orbit webpage and the "Preferences" option will
> > appear just above the top menu bar on Orbit web page, choose "Account" and
> > set your mailing list membership to "none".
> >
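
The two runs quoted in this thread differ in a way that is easy to check
mechanically: the failed run prints "FATAL service_call: Exception:
ServiceException", while the successful run ends with "N node(s) successfully
imaged". The following is a minimal sketch, not part of OMF, that scans saved
"omf-5.2 load" output for those two messages. The function names and the
returned dictionary are hypothetical, and the patterns assume only the message
formats visible in the logs above; other OMF versions may word them differently.

```python
import re

# Patterns taken from the controller output quoted in this thread.
# They are assumptions about the log format, not a documented OMF interface.
FATAL_RE = re.compile(r"FATAL\s+service_call:\s+Exception:\s+(\w+)")
IMAGED_RE = re.compile(r"(\d+)\s+node\(s\)\s+successfully\s+imaged")


def strip_quote(line):
    """Drop leading '>' quote markers that mail clients add to quoted text."""
    return re.sub(r"^(?:>\s?)+", "", line).strip()


def summarize_load_log(text):
    """Summarize one saved imaging run (hypothetical helper).

    Returns a dict with the exception names seen on FATAL service_call
    lines, the number of nodes reported as imaged (None if the summary
    line never appeared), and an overall ok flag.
    """
    fatal = []
    imaged = None
    for raw in text.splitlines():
        line = strip_quote(raw)
        match = FATAL_RE.search(line)
        if match:
            fatal.append(match.group(1))
        match = IMAGED_RE.search(line)
        if match:
            imaged = int(match.group(1))
    # A run counts as ok only if no FATAL line was seen and at least
    # one node was reported as imaged.
    return {"fatal": fatal, "imaged": imaged, "ok": not fatal and bool(imaged)}
```

To use it, save the console output of a run to a file and pass the file's
contents to summarize_load_log(). Quoted copies pasted from mail work as well,
since the leading ">" markers are stripped first. This only classifies the
controller output; the cause of a ServiceException would still have to be
found in /var/log/omf-aggmgr-5.2.log on repository2, as Christoph suggested.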