RE: Resource Usage Format - Accounting Information Sharing Models
- To: <guarise@to.infn.it>, "Jian Zhang" <jian@xcerla.com>
- Subject: RE: Resource Usage Format - Accounting Information Sharing Models
- From: "Jim Nasby" <decibel@ud.com>
- Date: Tue, 2 Jul 2002 11:45:53 -0500
- Cc: "Jackson, Scott M" <Scott.Jackson@pnl.gov>, "MiYoung Koo" <mkoo@nas.nasa.gov>, "Steven Newhouse" <sjn5@doc.ic.ac.uk>, <accounts-wg@gridforum.org>, <sched-wg@gridforum.org>, <npi-wg@gridforum.org>, <andrea.guarise@to.infn.it>, <catlett@mcs.anl.gov>
- Disposition-Notification-To: "Jim Nasby" <decibel@ud.com>
- Sender: owner-sched-wg@gridforum.org
- Thread-Index: AcIhp2zzE09vGlbFQv2nd3YvgBNS9AAP4LHw
- Thread-Topic: Resource Usage Format - Accounting Information Sharing Models
> -----Original Message-----
<snip>
> Scalability is our concern too. I'm only trying to fully understand the
> problem before going too far in the development. This is because so far we
> focused mainly on the banking infrastructure while the communication
> between Computing Element and the Accounting is still in an early stage.
> We haven't performed intensive tests yet, so I don't know how the system
> behaves with jobs running on clusters with so many nodes.
My experience is that the scalability of such a system isn't too bad if you use a decent database for the back end. The key is to avoid OLTP as much as possible. Performing a rollup of 1,000 records at the end of a job shouldn't take more than a second at worst, and even rollups of 10k or 100k records shouldn't take that long. The other key is to define a fixed format so that each row is a complete record of usage: CPU time, MFLOP rating, Whetstone, disk usage, memory usage, etc.
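
To make the rollup idea concrete, here is a minimal sketch rather than a description of any real accounting system. It assumes a hypothetical table node_usage with one complete usage row per node per job, and it uses SQLite only because it ships with Python; the column names and the sample values are made up for illustration.

```python
import sqlite3

# Hypothetical fixed-format table: each row is a complete usage record
# for one node of one job.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE node_usage (
           job_id TEXT, node TEXT,
           cpu_seconds REAL, memory_mb_seconds REAL, disk_mb REAL)"""
)

# Simulate a job that ran on 1,000 nodes (illustrative values only).
rows = [("Job1234.0@SCCS.edu", f"node{i:04d}", 12.0, 98.0, 5.0)
        for i in range(1000)]
conn.executemany("INSERT INTO node_usage VALUES (?, ?, ?, ?, ?)", rows)


def rollup(conn, job_id):
    """Collapse all per-node rows for one job into a single summary row."""
    return conn.execute(
        """SELECT job_id, COUNT(*), SUM(cpu_seconds),
                  SUM(memory_mb_seconds), MAX(disk_mb)
           FROM node_usage WHERE job_id = ? GROUP BY job_id""",
        (job_id,),
    ).fetchone()


summary = rollup(conn, "Job1234.0@SCCS.edu")
print(summary)
```

The rollup is one set-based query run once at the end of the job, instead of one small transaction per incoming record, which is the pattern I'm suggesting to avoid.
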
> > At one time, Victor Hazelwood and I did a "back of envelope
> > calculation" on the number of records per minute to be processed.
> > It quickly reaches a significant number for a large cluster (e.g.,
> > 1000s of nodes) unless there is an entity which does some
> > aggregation/pre-processing so that the usage records exchanged
> > would be a single record per job. In addition, as Scott pointed
> > out, these records could be large in size.
>
> Do you plan to publish your estimates in a paper? I think it would
> be very useful.
>
> Andrea
>
> >
> > - Jian
> >
> >
> > -----Original Message-----
> > From: owner-accounts-wg@gridforum.org
> > [mailto:owner-accounts-wg@gridforum.org]On Behalf Of
> guarise@to.infn.it
> > Sent: Monday, July 01, 2002 11:56 PM
> > To: Jackson, Scott M
> > Cc: 'MiYoung Koo'; Steven Newhouse; jian@xcerla.com;
> > accounts-wg@gridforum.org; sched-wg@gridforum.org;
> > 'npi-wg@gridforum.org'; 'andrea.guarise@to.infn.it';
> > 'catlett@mcs.anl.gov'
> > Subject: Re: Resource Usage Format - Accounting Information Sharing
> > Models
> >
> >
> > Jackson, Scott M wrote:
> >
> >>Usage Record Format - Accounting Information Sharing Models
> >>
> >>In a conversation I recently had with Jian Zhang, we discussed the fact
> >>that there is more than one model for exchanging accounting and usage
> >>information. In order to pick the "right" one to target for the emerging
> >>Usage Record Format Working Group, we would like to solicit your feedback
> >>and thoughts on this matter.
> >>
> >>It is believed that it does not make sense to prescribe in what format the
> >>end-application should store the usage and accounting records. We believe we
> >>should allow the application designer the flexibility to choose whether to
> >>store this information in a relational database, a directory service, or
> >>extract it out of flat files or logs. What is critical is that an external
> >>entity can request this information in a known way and be able to interpret
> >>the results of that request.
> >>
> >>I believe there are at least two models for exchanging this accounting
> >>information: (I) passing full accounting records around, or (II) defining a
> >>query syntax language. In both cases, a set of semantics would need to be
> >>defined to describe the bounds of resource usage. Is it more important that
> >>the generator of usage records be able to package it up into a known
> >>complete record format when requested or that it be able to respond to query
> >>requests with the appropriate specific responses?
> >>
> >>I. Defining a usage record format to be passed around as a full record
> >>
> >>In this scenario, we could define a format that describes the resources used
> >>within a particular session (job), for example:
> >>
> >>  <UsageRecord id="Job1234.0@SCCS.edu">
> >>    <Who user="scott" account="chem101"/>
> >>    <When completionDateTime="999999999"
> >>          startedDateTime="999999500" submittedDateTime="999999000"/>
> >>    <Where resourceProvider="SCCS.edu"
> >>           resourceConsumer="PNNL.gov"/>
> >>    <ResourcesConsumed>
> >>      <ProcessorSeconds>12346</ProcessorSeconds>
> >>      <MemorySeconds units="MBs">98765</MemorySeconds>
> >>      ...
> >>    </ResourcesConsumed>
> >>  </UsageRecord>
> >>
> >>When asked for this information, the whole record would be returned (as an
> >>object?)
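
As an illustration of what consuming a full record under model I might look like, the sketch below parses the example record with Python's standard xml.etree.ElementTree. The XML literal is the example above with its attribute quoting and closing tags tidied so that it parses and the elided "..." fields left out; parse_usage_record and the wall_seconds field are hypothetical names introduced here, not part of any proposed format.

```python
import xml.etree.ElementTree as ET

# The example record from the message, tidied so it is well-formed XML.
RECORD = """
<UsageRecord id="Job1234.0@SCCS.edu">
  <Who user="scott" account="chem101"/>
  <When completionDateTime="999999999"
        startedDateTime="999999500" submittedDateTime="999999000"/>
  <Where resourceProvider="SCCS.edu" resourceConsumer="PNNL.gov"/>
  <ResourcesConsumed>
    <ProcessorSeconds>12346</ProcessorSeconds>
    <MemorySeconds units="MBs">98765</MemorySeconds>
  </ResourcesConsumed>
</UsageRecord>
"""


def parse_usage_record(xml_text):
    """Turn one full usage record into a plain dictionary."""
    root = ET.fromstring(xml_text)
    when = root.find("When").attrib
    # Every child of <ResourcesConsumed> becomes a numeric entry.
    consumed = {child.tag: float(child.text)
                for child in root.find("ResourcesConsumed")}
    return {
        "id": root.get("id"),
        "user": root.find("Who").get("user"),
        "account": root.find("Who").get("account"),
        # Derived value (an assumption): elapsed time from start to completion.
        "wall_seconds": int(when["completionDateTime"])
                        - int(when["startedDateTime"]),
        "consumed": consumed,
    }


record = parse_usage_record(RECORD)
print(record)
```

The receiver has to take the whole record even if it only wants one field, which is the cost listed under Cons for this model.
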
> >>
> >>Pros:
> >> Readily supports a structured description of resource utilization
> >>(for example, being able to say how much of the ProcessorSecondsConsumed
> >>was on which NodeTypes).
> >>
> >>Cons:
> >> These records would contain all of the accounting-relevant
> >>information collected by the resource provider and would likely be very
> >>large and inefficient to pass around as a whole.
> >> It might also be difficult to require the usage record generator (a
> >>batch system etc.) to package the information into a structured record
> >>format.
> >>
> >>II. Defining a usage query syntax language
> >>
> >>In this scenario, it is the query language that is well-defined. The result
> >>that you get depends upon your question. You can ask for all or part of a
> >>record, or potentially even specify the format you want the response to be
> >>in.
> >>
> >>Unlike the above case, we do not specify what the full-record format should
> >>be, but rather how to ask for some or all of the information about a session.
> >>We do not pass around records; we respond to queries.
> >>
> >><Request object="UsageRecord" action="query">
> >>  <get>JobId</get>
> >>  <get>UserName</get>
> >>  <get>AccountName</get>
> >>  <get>ProcessorSecondsConsumed</get>
> >>  <get units="MBs">MemorySecondsConsumed</get>
> >>  <where name="JobId">Job1234.0@SCCS.edu</where>
> >></Request>
> >>
> >><Response>
> >>  <UsageRecord>
> >>    <JobId>Job1234.0@SCCS.edu</JobId>
> >>    <UserName>scott</UserName>
> >>    <AccountName>chem101</AccountName>
> >>    <ProcessorSecondsConsumed>12345</ProcessorSecondsConsumed>
> >>    <MemorySecondsConsumed>98765</MemorySecondsConsumed>
> >>  </UsageRecord>
> >></Response>
> >>
> >>If your request had no <get> statements indicating the fields you wanted
> >>returned, it would give you all the available fields, for all of the records
> >>selected in your request. What is well-defined here are the semantics of the
> >>usage fields you can request about a session. A query language like QUILT
> >>could even be used to return the response in any format you desire:
> >>XML, HTML (ready for GUI consumption), pretty-printed, etc.
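
A rough sketch of how a responder for model II might behave, under stated assumptions: the records live in a hypothetical in-memory dictionary (STORE), the request's closing tag is written as </Request> so the XML parses, and the units attribute on <get> is ignored. answer_query is an illustrative name, not an existing API, and this is not QUILT.

```python
import xml.etree.ElementTree as ET

# Hypothetical record store keyed by JobId; values taken from the example.
STORE = {
    "Job1234.0@SCCS.edu": {
        "JobId": "Job1234.0@SCCS.edu",
        "UserName": "scott",
        "AccountName": "chem101",
        "ProcessorSecondsConsumed": "12345",
        "MemorySecondsConsumed": "98765",
    }
}


def answer_query(request_xml, store):
    """Return a <Response> holding only the fields named in <get> elements."""
    request = ET.fromstring(request_xml)
    wanted = [g.text for g in request.findall("get")]
    conditions = {w.get("name"): w.text for w in request.findall("where")}
    response = ET.Element("Response")
    for rec in store.values():
        # A record is selected when it satisfies every <where> condition.
        if all(rec.get(k) == v for k, v in conditions.items()):
            out = ET.SubElement(response, "UsageRecord")
            # No <get> elements means "return every available field".
            for field in (wanted or list(rec)):
                if field in rec:
                    ET.SubElement(out, field).text = rec[field]
    return ET.tostring(response, encoding="unicode")


REQUEST = """<Request object="UsageRecord" action="query">
  <get>UserName</get>
  <get>ProcessorSecondsConsumed</get>
  <where name="JobId">Job1234.0@SCCS.edu</where>
</Request>"""

reply = answer_query(REQUEST, STORE)
print(reply)
```

Only the requested fields cross the wire, which is the advantage claimed for this model; the flat field list also shows why nested, structured usage is harder to express this way.
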
> >>
> >>Pros:
> >>
> >> It is very easy to get exactly the information you want, suitable
> >>for your own consumption, without having to pull fields that may not be
> >>relevant to your request.
> >>
> >>Cons:
> >>
> >> It may be harder to support structured accounting information
> >>
> >>To help identify the right model, I feel it would be helpful to identify
> >>some very specific use cases in which this usage/accounting information
> >>might need to be passed between software entities. I recall someone pointing
> >>out as an example that the TeraGrid project might need to share cycles and
> >>hence accounting information with people in the European DataGrid project.
> >>Could anyone identify two specific software components that would need to
> >>exchange accounting information and what questions they would like to ask of
> >>the other side, to help identify whether we should be talking about tossing
> >>complete records around or responding to informational requests in a
> >>well-known way?
> >>
> >>This whole question of whether to support structured accounting information
> >>presents some hard problems (perhaps), since structuring may be required in
> >>arbitrary ways, and very differently in different cases (e.g. some resources
> >>have parent-child relationships in some architectural designs but the
> >>reverse child-parent relationship in others). If we choose to require a
> >>structured-resource usage record format, will we be imposing a
> >>significant burden on supporters of the protocol?
> >>
> >>Scott Jackson (Pacific Northwest National Laboratory)
> >>
> >
> >
> > Hi,
> >
> > Personally, in DataGrid, we are currently following the first approach
> > to the problem, mainly because in our architecture the "sensor" system
> > that gathers information about jobs on the Computing Element pushes the
> > usage information to the Accounting System when the job is completed.
> >
> > Clearly this approach is very simple but has the disadvantages already
> > mentioned.
> >
> > It would be interesting to have an idea of how many usage records a
> > standard job needs, more or less, to be correctly accounted, since this
> > information can help in deciding between the two approaches.
> >
> > As a little help in starting the discussion, I think we must separate the
> > information that identifies the job from the usage records. As an example,
> > in DataGrid we identify the job with the following information:
> >
> > - DGJobId (job identifier)
> > - Job Submission Timestamp (time of submission in GMT)
> > - User X509 certificate subject (used to unambiguously identify the
> > user owning the job)
> > - Resource X509 certificate subject (used to identify the Computing
> > Element that ran the job)
> > - Resource's HLR x509 cert_subject (identifies the bank branch (that
> > we call HLR) that stores the account for the Computing Element)
> > - Resource's PA url (the URL of the Price Authority responsible for
> > assigning the price to the Computing Element)
> >
> > Then there are the usage records, and here the needs may be very different.
> >
> > We currently use CPU_TIME only, but only because we are still in the
> > development phase. In the production phase we plan to use more parameters
> > such as memory, storage, etc., but our needs won't be much more than
> > these, since we use commodity PCs in our clusters.
> >
> > So I can estimate that we won't use more than 15-25 (including the job
> > identification) records to describe a job (hope to be right!), but we
> > haven't fully investigated the problem so far.
> >
> > Cheers,
> >
> > Andrea Guarise
> >
> >
> > --
> > Andrea Guarise
> >
> > Istituto Nazionale di Fisica Nucleare
> > Sezione di Torino, Centro di Calcolo
> >
> > Via Pietro Giuria 1, Torino, ITALY
> >
> > Voice: +39116707474 FAX: 0116680328
> > E-mail : andrea.guarise@to.infn.it
> >
>
>
>
> --
> Andrea Guarise
>
> Istituto Nazionale di Fisica Nucleare
> Sezione di Torino, Centro di Calcolo
>
> Via Pietro Giuria 1, Torino, ITALY
>
> Voice: +39116707474 FAX: 0116680328
> E-mail : andrea.guarise@to.infn.it
>
>