
RE: Resource Usage Format - Accounting Information Sharing Models



> -----Original Message-----
<snip> 
> Scalability is our concern too. I'm only trying to fully understand
> the problem before going too far in the development. This is because
> so far we have focused mainly on the banking infrastructure, while the
> communication between the Computing Element and the Accounting is
> still at an early stage.
> We haven't performed intensive tests yet, so I don't know how the
> system behaves with jobs running on clusters with so many nodes.

My experience is that the scalability of such a system isn't too bad if you use a decent database for the back-end. The key is to avoid OLTP as much as possible. Performing a rollup of 1,000 records at the end of a job shouldn't take more than a second at worst, and even rollups of 10k or 100k shouldn't take that long. It matters just as much, though, to define a fixed format so that each row is a complete record of usage: CPU time, MFLOP rating, Whetstone, disk usage, memory usage, etc.
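
A minimal sketch of such an end-of-job rollup, to make the idea concrete. Everything in it is an assumption for illustration: the table names (node_usage, job_usage), the column names, and the use of Python's built-in sqlite3 module as a stand-in for whatever back-end database is actually deployed. Each per-node row is a complete, fixed-format record, and the rollup is one aggregate query per job rather than a transaction per record.

```python
import sqlite3


def make_demo_db(job_id, n_nodes=1000):
    """Build an in-memory database holding one fixed-format row per node."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: every row is a complete usage record for one node.
    conn.execute(
        "CREATE TABLE node_usage ("
        "job_id TEXT, node TEXT, cpu_seconds REAL, memory_mb REAL, disk_mb REAL)"
    )
    conn.execute(
        "CREATE TABLE job_usage ("
        "job_id TEXT PRIMARY KEY, nodes INTEGER, cpu_seconds REAL, "
        "peak_memory_mb REAL, disk_mb REAL)"
    )
    conn.executemany(
        "INSERT INTO node_usage VALUES (?, ?, ?, ?, ?)",
        [(job_id, f"node{i}", 10.0, 256.0, 1.0) for i in range(n_nodes)],
    )
    conn.commit()
    return conn


def rollup_job(conn, job_id):
    """Collapse all per-node rows for one job into a single summary row."""
    row = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(cpu_seconds), 0), "
        "COALESCE(MAX(memory_mb), 0), COALESCE(SUM(disk_mb), 0) "
        "FROM node_usage WHERE job_id = ?",
        (job_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO job_usage VALUES (?, ?, ?, ?, ?)", (job_id, *row)
    )
    conn.commit()
    return row


if __name__ == "__main__":
    db = make_demo_db("Job1234.0@SCCS.edu")
    print(rollup_job(db, "Job1234.0@SCCS.edu"))
```

Only the single job_usage row would then need to be exchanged with the accounting side; the per-node rows stay local.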
 
>  > At one time, Victor Hazelwood and I did a "back of envelope
>  > calculation" on the number of records per minute to be processed.
>  > It quickly reaches a significant number for a large cluster (e.g.,
>  > thousands of nodes) unless there is an entity which does some
>  > aggregation/pre-processing so that the usage records exchanged
>  > would be a single record per job. In addition, as Scott pointed
>  > out, these records could be large in size.
> 
> Did you plan to publish your estimates in a paper? I think it would
> be very useful.
> 
> Andrea
> 
> > 
> >  - Jian
> > 
> > 
> > -----Original Message-----
> > From: owner-accounts-wg@gridforum.org
> > [mailto:owner-accounts-wg@gridforum.org] On Behalf Of guarise@to.infn.it
> > Sent: Monday, July 01, 2002 11:56 PM
> > To: Jackson, Scott M
> > Cc: 'MiYoung Koo'; Steven Newhouse; jian@xcerla.com;
> > accounts-wg@gridforum.org; sched-wg@gridforum.org;
> > 'npi-wg@gridforum.org'; 'andrea.guarise@to.infn.it';
> > 'catlett@mcs.anl.gov'
> > Subject: Re: Resource Usage Format - Accounting Information Sharing
> > Models
> > 
> > 
> > Jackson, Scott M wrote:
> > 
> >>Usage Record Format - Accounting Information Sharing Models
> >>
> >>In a conversation I recently had with Jian Zhang, we discussed the
> >>fact that there is more than one model for exchanging accounting and
> >>usage information. In order to pick the "right" one to target for the
> >>emerging Usage Record Format Working Group, we would like to solicit
> >>your feedback and thoughts on this matter.
> >>
> >>It is believed that it does not make sense to prescribe in what
> >>format the end-application should store the usage and accounting
> >>records. We believe we should allow the application designer the
> >>flexibility to choose whether to store this information in a
> >>relational database, a directory service, or extract it out of flat
> >>files or logs. What is critical is that an external entity can
> >>request this information in a known way and be able to interpret the
> >>results of that request.
> >>
> >>I believe there are at least two models for exchanging this
> >>accounting information: (I) passing full accounting records around,
> >>or (II) defining a query syntax language. In both cases, a set of
> >>semantics would need to be defined to describe the bounds of resource
> >>usage. Is it more important that the generator of usage records be
> >>able to package it up into a known complete record format when
> >>requested, or that it be able to respond to query requests with the
> >>appropriate specific responses?
> >>
> >>I.	Defining a usage record format to be passed around as a full record
> >>
> >>In this scenario, we could define a format that describes the
> >>resources used within a particular session (job), i.e.
> >>
> >>	<UsageRecord id="Job1234.0@SCCS.edu">
> >>		<Who user="scott" account="chem101"/>
> >>		<When completionDateTime="999999999"
> >>		      startedDateTime="999999500" submittedDateTime="999999000"/>
> >>		<Where resourceProvider="SCCS.edu"
> >>		       resourceConsumer="PNNL.gov"/>
> >>		<ResourcesConsumed>
> >>			<ProcessorSeconds>12346</ProcessorSeconds>
> >>			<MemorySeconds units="MBs">98765</MemorySeconds>
> >>			...
> >>		</ResourcesConsumed>
> >>	</UsageRecord>
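
To illustrate what a consumer of model (I) would have to do with such a record, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element and attribute names are taken from the example record; the record is reproduced here as a well-formed string with the elided fields left out. The function name parse_usage_record and the derived wall_seconds field are assumptions made for this sketch, not part of any proposed format.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the example record, without the elided ("...") fields.
RECORD = """<UsageRecord id="Job1234.0@SCCS.edu">
  <Who user="scott" account="chem101"/>
  <When completionDateTime="999999999"
        startedDateTime="999999500" submittedDateTime="999999000"/>
  <Where resourceProvider="SCCS.edu" resourceConsumer="PNNL.gov"/>
  <ResourcesConsumed>
    <ProcessorSeconds>12346</ProcessorSeconds>
    <MemorySeconds units="MBs">98765</MemorySeconds>
  </ResourcesConsumed>
</UsageRecord>"""


def parse_usage_record(xml_text):
    """Turn one full usage record into a plain dictionary."""
    root = ET.fromstring(xml_text)
    who = root.find("Who")
    when = root.find("When")
    # Map each consumed resource to (amount, units); units may be absent.
    consumed = {
        child.tag: (int(child.text), child.get("units"))
        for child in root.find("ResourcesConsumed")
    }
    return {
        "id": root.get("id"),
        "user": who.get("user"),
        "account": who.get("account"),
        # Hypothetical derived field: completion time minus start time.
        "wall_seconds": int(when.get("completionDateTime"))
        - int(when.get("startedDateTime")),
        "consumed": consumed,
    }


if __name__ == "__main__":
    print(parse_usage_record(RECORD))
```

Note that the receiver has to parse the whole record even if it only wants one field, which is the cost listed under the cons of this model.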
> >>
> >>When asked for this information, the whole record would be returned
> >>(as an object?)
> >>
> >>Pros:
> >>	Readily supports a structured description of resource utilization
> >>(like being able to say how much of the ProcessorSecondsConsumed was
> >>on which NodeTypes).
> >>
> >>Cons:
> >>	These records would contain all of the accounting-relevant
> >>information collected by the resource provider and would likely be
> >>very large and inefficient to pass around as a whole.
> >>	It might also be difficult to require the usage record generator (a
> >>batch system etc.) to package the information into a structured
> >>record format.
> >>
> >>II.	Defining a usage query syntax language
> >>
> >>In this scenario, it is the query language that is well-defined. The
> >>result that you get depends upon your question. You can ask for all
> >>or part of a record, or potentially even specify the format you want
> >>the response to be in.
> >>
> >>Unlike the above case, we do not specify what the full-record format
> >>should be, but rather how to ask for some or all of the information
> >>about a session. We do not pass around records, we respond to
> >>queries.
> >>
> >><Request object="UsageRecord" action="query">
> >>	<get>JobId</get>
> >>	<get>UserName</get>
> >>	<get>AccountName</get>
> >>	<get>ProcessorSecondsConsumed</get>
> >>	<get units="MBs">MemorySecondsConsumed</get>
> >>	<where name="JobId">Job1234.0@SCCS.edu</where>
> >></Request>
> >>
> >><Response>
> >>	<UsageRecord>
> >>		<JobId>Job1234.0@SCCS.edu</JobId>
> >>		<UserName>scott</UserName>
> >>		<AccountName>chem101</AccountName>
> >>		<ProcessorSecondsConsumed>12345</ProcessorSecondsConsumed>
> >>		<MemorySecondsConsumed>98765</MemorySecondsConsumed>
> >>	</UsageRecord>
> >></Response>
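
For comparison with model (I), the responder side of such a query could look roughly like the following sketch. The Request, get, where, Response and UsageRecord element names come from the example above; the function answer_query, the in-memory list of dictionaries standing in for the provider's store, and the second sample record are assumptions made purely for illustration. Unit conversion (the units attribute on get) is deliberately not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory store; a real provider might use a database or logs.
RECORDS = [
    {"JobId": "Job1234.0@SCCS.edu", "UserName": "scott",
     "AccountName": "chem101", "ProcessorSecondsConsumed": "12345",
     "MemorySecondsConsumed": "98765"},
    {"JobId": "Job1235.0@SCCS.edu", "UserName": "scott",
     "AccountName": "chem101", "ProcessorSecondsConsumed": "10",
     "MemorySecondsConsumed": "20"},
]

REQUEST = """<Request object="UsageRecord" action="query">
  <get>JobId</get>
  <get>UserName</get>
  <get>ProcessorSecondsConsumed</get>
  <where name="JobId">Job1234.0@SCCS.edu</where>
</Request>"""


def answer_query(request_xml, records):
    """Build a <Response> holding only the requested fields of matching records."""
    request = ET.fromstring(request_xml)
    wanted = [g.text for g in request.findall("get")]
    conditions = {w.get("name"): w.text for w in request.findall("where")}
    response = ET.Element("Response")
    for record in records:
        # Skip records that fail any <where> condition.
        if any(record.get(k) != v for k, v in conditions.items()):
            continue
        out = ET.SubElement(response, "UsageRecord")
        # No <get> elements means "return every available field".
        for name in (wanted or list(record)):
            if name in record:
                ET.SubElement(out, name).text = record[name]
    return response


if __name__ == "__main__":
    print(ET.tostring(answer_query(REQUEST, RECORDS), encoding="unicode"))
```

The sketch shows why structured information is harder here: a flat list of field names has no obvious way to ask for nested data such as usage per node type.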
> >>
> >>If your request had no <get> statements indicating the fields you
> >>wanted returned, it would give you all the available fields for all
> >>of the records selected in your request. What is well-defined here
> >>are the semantics of the usage fields you can request about a
> >>session. A query language like QUILT could even be used to return the
> >>response in any format you desired: XML, HTML (ready for GUI
> >>consumption), pretty-printed, etc.
> >>
> >>Pros:
> >>
> >>	It is very easy to get exactly the information you want, suitable
> >>for your own consumption, without having to pull all the fields that
> >>may not be relevant to your request.
> >>
> >>Cons:
> >>
> >>	It may be harder to support structured accounting information
> >>
> >>To help identify the right model, I feel it would be helpful to
> >>identify some very specific use cases in which this usage/accounting
> >>information might need to be passed between software entities. I
> >>recall someone pointing out as an example that the TeraGrid project
> >>might need to share cycles, and hence accounting information, with
> >>people in the European DataGrid project. Could anyone identify two
> >>specific software components that would need to exchange accounting
> >>information, and what questions they would like to ask of the other
> >>side, to help identify whether we should be talking about tossing
> >>complete records around or responding to informational requests in a
> >>well-known way?
> >>
> >>This whole question of whether to support structured accounting
> >>information presents some hard problems (perhaps), since structuring
> >>may be required in arbitrary ways, and very differently in different
> >>cases (i.e. some resources have parent-child relationships in some
> >>architectural designs but the reverse child-parent relationship in
> >>others). If we choose to require a structured-resource usage record
> >>format, will we be imposing a significant burden on supporters of
> >>the protocol?
> >>
> >>Scott Jackson (Pacific Northwest National Laboratory)
> >>
> > 
> > 
> > Hi,
> > 
> > Personally, in DataGrid, we are currently following the first
> > approach to the problem, mainly because in our architecture the
> > "sensor" system that gathers information about jobs on the Computing
> > Element pushes the usage information to the Accounting System when
> > the job is completed.
> > 
> > Clearly this approach is very simple but has the disadvantages
> > already mentioned.
> > 
> > It would be interesting to have an idea of how many usage records a
> > standard job needs, more or less, to be correctly accounted, since
> > this information can help in deciding between the two approaches.
> > 
> > As a little help in starting the discussion, I think we must
> > separate the information that identifies the job from the usage
> > records. As an example, in DataGrid we identify the job with the
> > following information:
> > 
> >   - DGJobId   (identifier of the job)
> >   - Job Submission Timestamp  (time of submission in GMT)
> >   - User X509 certificate subject (used to unambiguously identify
> >     the user owning the job)
> >   - Resource X509 certificate subject (used to identify the
> >     Computing Element that ran the job)
> >   - Resource's HLR X509 cert_subject (identifies the bank branch,
> >     which we call HLR, that stores the account for the Computing
> >     Element)
> >   - Resource's PA url (the URL of the Price Authority responsible
> >     for assigning the price to the Computing Element)
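
The separation described in this list, job identification on one side and usage values on the other, can be sketched as two small Python data classes. The six identity fields mirror the list above; the class names, attribute names, the record_count helper and all example values are hypothetical and not taken from the DataGrid code.

```python
from dataclasses import dataclass, field, fields
from typing import Dict


@dataclass(frozen=True)
class JobIdentity:
    """The six items that identify a job, kept apart from usage values."""
    dg_job_id: str
    submission_time_gmt: int        # job submission timestamp, in GMT
    user_cert_subject: str          # X509 subject of the user owning the job
    resource_cert_subject: str      # X509 subject of the Computing Element
    hlr_cert_subject: str           # X509 subject of the resource's HLR
    pa_url: str                     # URL of the Price Authority


@dataclass
class JobAccounting:
    """One job: a fixed identity plus an open-ended set of usage values."""
    identity: JobIdentity
    usage: Dict[str, float] = field(default_factory=dict)

    def record_count(self) -> int:
        # Identification items plus one entry per usage parameter.
        return len(fields(self.identity)) + len(self.usage)


# Placeholder values only; none of these are real identifiers.
example = JobAccounting(
    identity=JobIdentity(
        dg_job_id="example-job-id",
        submission_time_gmt=0,
        user_cert_subject="example-user-subject",
        resource_cert_subject="example-ce-subject",
        hlr_cert_subject="example-hlr-subject",
        pa_url="example-pa-url",
    ),
    usage={"CPU_TIME": 12345.0},
)

if __name__ == "__main__":
    print(example.record_count())
```

Adding parameters such as memory or storage only grows the usage dictionary; the identity part stays the same for every job.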
> > 
> > Then there are the usage records, and here the needs may be very
> > different.
> > 
> > We currently use CPU_TIME only, but only because we are still in the
> > development phase. In the production phase we plan to use more
> > parameters, such as memory, storage etc., but our needs won't be much
> > more than these, since we use commodity PCs in our clusters.
> > 
> > So I can estimate that we won't use more than 15-25 records
> > (including the job identification) to describe a job (hope to be
> > right!), but we haven't fully investigated the problem so far.
> > 
> > Cheers,
> > 
> > Andrea Guarise
> > 
> > 
> > --
> >    Andrea Guarise
> > 
> >    Istituto Nazionale di Fisica Nucleare
> >    Sezione di Torino, Centro di Calcolo
> > 
> >    Via Pietro Giuria 1, Torino, ITALY
> > 
> >    Voice: +39116707474  FAX: 0116680328
> >    E-mail : andrea.guarise@to.infn.it
> > 