mò %U²Ic@s‚dZdkZdklZddgZdeifd„ƒYZdeifd„ƒYZe d„Z e djo e ƒndS( svHTML 2.0 parser. See the HTML 2.0 specification: http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html-spec_toc.html N(sAS_ISt HTMLParsertHTMLParseErrorcBstZdZRS(s3Error raised when an HTML document can't be parsed.(t__name__t __module__t__doc__(((t$/data/zmath/lib/python2.4/htmllib.pyRs cBs0tZdZdklZdd„Zd„Zd„Zd„Zd„Z d„Z d „Z d „Z d „Z d „Zd „Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Zd„Z d„Z!d „Z"d!„Z#d"„Z$d#„Z%d$„Z&d%„Z'd&„Z(d'„Z)d(„Z*d)„Z+d*„Z,d+„Z-d,„Z.d-„Z/d.„Z0d/„Z1d0„Z2d1„Z3d2„Z4d3„Z5d4„Z6d5„Z7d6„Z8d7„Z9d8„Z:d9„Z;d:„Z<d;„Z=d<„Z>dd=„Z?d>„Z@d?„ZAd@„ZBdA„ZCdB„ZDdC„ZEdD„ZFdE„ZGdF„ZHdG„ZIdH„ZJdI„ZKdJ„ZLdK„ZMdL„ZNdM„ZOdN„ZPdO„ZQdP„ZRdQ„ZSdR„ZTdS„ZUdT„ZVdU„ZWdV„ZXdW„ZYdX„ZZdY„Z[RS(ZsÌThis is the basic HTML parser class. It supports all entity names required by the XHTML 1.0 Recommendation. It also defines handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements. (s entitydefsicCs tii||ƒ||_dS(s•Creates an instance of the HTMLParser class. The formatter parameter is the formatter instance associated with the parser. N(tsgmllibt SGMLParsert__init__tselftverboset formatter(R R R ((RRscCst|ƒ‚dS(N(Rtmessage(R R ((Rterror'scCs\tii|ƒd|_d|_d|_d|_d|_ g|_ d|_ g|_ dS(Ni( RRtresetR tNonetsavedatatisindexttitletbasetanchort anchorlisttnofillt list_stack(R ((RR*s       cCsV|idj o|i||_n/|io|ii|ƒn|ii|ƒdS(N(R RRtdataRR tadd_literal_datatadd_flowing_data(R R((Rt handle_data:s  cCs d|_dS(sêBegins saving character data in a buffer instead of sending it to the formatter object. Retrieve the stored data via the save_end() method. Use of the save_bgn() / save_end() pair may not be nested. tN(R R(R ((Rtsave_bgnEscCs9|i}d|_|ipdi|iƒƒ}n|S(sHEnds buffering character data and returns all data saved since the preceding call to the save_bgn() method. If the nofill flag is false, whitespace is collapsed to single spaces. A call to this method without a preceding call to the save_bgn() method will raise a TypeError exception. t N(R RRRRtjointsplit(R R((Rtsave_endOs    cCs+||_|io|ii|ƒndS(s}This method is called at the start of an anchor region. The arguments correspond to the attributes of the tag with the same names. The default implementation maintains a list of hyperlinks (defined by the HREF attribute for tags) within the document. The list of hyperlinks is available as the data attribute anchorlist. N(threfR RRtappend(R R"tnamettype((Rt anchor_bgn`s   cCs5|io'|idt|iƒƒd|_ndS(sØThis method is called at the end of an anchor region. The default implementation adds a textual footnote marker using an index into the list of hyperlinks created by the anchor_bgn()method. s[%d]N(R RRtlenRR(R ((Rt anchor_endns cGs|i|ƒdS(s–This method is called to handle images. The default implementation simply passes the alt value to the handle_data() method. N(R Rtalt(R tsrcR)targs((Rt handle_image{scCsdS(N((R tattrs((Rt start_html†scCsdS(N((R ((Rtend_html‡scCsdS(N((R R-((Rt start_head‰scCsdS(N((R ((Rtend_headŠscCsdS(N((R R-((Rt start_bodyŒscCsdS(N((R ((Rtend_bodyscCs|iƒdS(N(R R(R R-((Rt start_title‘scCs|iƒ|_dS(N(R R!R(R ((Rt end_title”scCs5x.|D]&\}}|djo ||_qqWdS(NR"(R-tatvR R(R R-R6R7((Rtdo_base—s  cCs d|_dS(Ni(R R(R R-((Rt do_isindexœscCsdS(N((R R-((Rtdo_linkŸscCsdS(N((R R-((Rtdo_meta¢scCsdS(N((R R-((Rt do_nextid¥scCs$|iidƒ|iidƒdS(Nith1i(R=iii(R R t end_paragrapht push_font(R R-((Rtstart_h1¬scCs!|iidƒ|iiƒdS(Ni(R R R>tpop_font(R ((Rtend_h1°scCs$|iidƒ|iidƒdS(Nith2i(RCiii(R R R>R?(R R-((Rtstart_h2´scCs!|iidƒ|iiƒdS(Ni(R R R>RA(R ((Rtend_h2¸scCs$|iidƒ|iidƒdS(Nith3i(RFiii(R R R>R?(R R-((Rtstart_h3¼scCs!|iidƒ|iiƒdS(Ni(R R R>RA(R ((Rtend_h3ÀscCs$|iidƒ|iidƒdS(Nith4i(RIiii(R R R>R?(R R-((Rtstart_h4ÄscCs!|iidƒ|iiƒdS(Ni(R R R>RA(R ((Rtend_h4ÈscCs$|iidƒ|iidƒdS(Nith5i(RLiii(R R R>R?(R R-((Rtstart_h5ÌscCs!|iidƒ|iiƒdS(Ni(R R R>RA(R ((Rtend_h5ÐscCs$|iidƒ|iidƒdS(Nith6i(ROiii(R R R>R?(R R-((Rtstart_h6ÔscCs!|iidƒ|iiƒdS(Ni(R R R>RA(R ((Rtend_h6ØscCs|iidƒdS(Ni(R R R>(R R-((Rtdo_pÞscCs@|iidƒ|iitttdfƒ|id|_dS(Ni(R R R>R?tAS_ISR(R R-((Rt start_preáscCs:|iidƒ|iiƒtd|idƒ|_dS(Nii(R R R>RAtmaxR(R ((Rtend_preæs cCs|i|ƒ|idƒdS(Ntxmp(R RTR-t setliteral(R R-((Rt start_xmpës cCs|iƒdS(N(R RV(R ((Rtend_xmpïscCs|i|ƒ|idƒdS(Ntlisting(R RTR-RX(R R-((Rt start_listingòs cCs|iƒdS(N(R RV(R ((Rt end_listingöscCs0|iidƒ|iitdttfƒdS(Nii(R R R>R?RS(R R-((Rt start_addressùscCs!|iidƒ|iiƒdS(Ni(R R R>RA(R ((Rt end_addressýscCs$|iidƒ|iidƒdS(Nit blockquote(R R R>t push_margin(R R-((Rtstart_blockquotescCs!|iidƒ|iiƒdS(Ni(R R R>t pop_margin(R ((Rtend_blockquotescCsA|ii|i ƒ|iidƒ|iidddgƒdS(Ntult*i(R R R>RRaR#(R R-((Rtstart_ul scCs=|io|id=n|ii|i ƒ|iiƒdS(Niÿÿÿÿ(R RR R>Rc(R ((Rtend_uls cCsm|iidƒ|io0|id\}}}}|d|d<}n d\}}|ii||ƒdS(NiiÿÿÿÿiiRf(Rfi( R R R>Rtdummytlabeltcounterttoptadd_label_data(R R-RiRlRkRj((Rtdo_lis   cCs–|ii|i ƒ|iidƒd}xL|D]D\}}|djo+t |ƒdjo|d}n|}q1q1W|ii d|dgƒdS(Ntols1.R%it.i( R R R>RRaRjR-R6R7R'R#(R R-R6RjR7((Rtstart_ols  cCs=|io|id=n|ii|i ƒ|iiƒdS(Niÿÿÿÿ(R RR R>Rc(R ((Rtend_ol(s cCs|i|ƒdS(N(R RgR-(R R-((Rt start_menu-scCs|iƒdS(N(R Rh(R ((Rtend_menu0scCs|i|ƒdS(N(R RgR-(R R-((Rt start_dir3scCs|iƒdS(N(R Rh(R ((Rtend_dir6scCs-|iidƒ|iidddgƒdS(NitdlRi(R R R>RR#(R R-((Rtstart_dl9scCs)|idƒ|io|id=ndS(Niiÿÿÿÿ(R tddpopR(R ((Rtend_dl=s  cCs|iƒdS(N(R Ry(R R-((Rtdo_dtAscCs7|iƒ|iidƒ|iidddgƒdS(NtddRi(R RyR RaRR#(R R-((Rtdo_ddDs cCsU|ii|ƒ|io7|idddjo|id=|iiƒqQndS(NiÿÿÿÿiR|(R R R>tblRRc(R R~((RRyIs   cCs|i|ƒdS(N(R tstart_iR-(R R-((Rt start_citeTscCs|iƒdS(N(R tend_i(R ((Rtend_citeUscCs|i|ƒdS(N(R tstart_ttR-(R R-((Rt start_codeWscCs|iƒdS(N(R tend_tt(R ((Rtend_codeXscCs|i|ƒdS(N(R RR-(R R-((Rtstart_emZscCs|iƒdS(N(R R(R ((Rtend_em[scCs|i|ƒdS(N(R RƒR-(R R-((Rt start_kbd]scCs|iƒdS(N(R R…(R ((Rtend_kbd^scCs|i|ƒdS(N(R RƒR-(R R-((Rt start_samp`scCs|iƒdS(N(R R…(R ((Rtend_sampascCs|i|ƒdS(N(R tstart_bR-(R R-((Rt start_strongcscCs|iƒdS(N(R tend_b(R ((Rt end_strongdscCs|i|ƒdS(N(R RR-(R R-((Rt start_varfscCs|iƒdS(N(R R(R ((Rtend_vargscCs |iitdttfƒdS(Ni(R R R?RS(R R-((RRkscCs|iiƒdS(N(R R RA(R ((RRmscCs |iittdtfƒdS(Ni(R R R?RS(R R-((RRpscCs|iiƒdS(N(R R RA(R ((RRrscCs |iitttdfƒdS(Ni(R R R?RS(R R-((RRƒuscCs|iiƒdS(N(R R RA(R ((RR…wscCs—d}d}d}xk|D]c\}}|iƒ}|djo |}n|djo |}n|djo|iƒ}qqW|i |||ƒdS(NRR"R$R%( R"R$R%R-tattrnametvaluetstriptlowerR R&(R R-R$R”R“R"R%((Rtstart_azs       cCs|iƒdS(N(R R((R ((Rtend_aˆscCs|iiƒdS(N(R R tadd_line_break(R R-((Rtdo_brscCs|iiƒdS(N(R R t add_hor_rule(R R-((Rtdo_hr’sc Cs%d}d}d}d}d}d} xÞ|D]Ö\}}|djo |}n|djo |}n|djo |}n|djo |}n|djo*yt |ƒ}WqÊt j oqÊXn|d jo*yt |ƒ} Wqt j oqXq+q+W|i |||||| ƒdS( NRs(image)italignR)tismapR*twidththeight( RR)RžR*RŸR R-R“R”tintt ValueErrorR R,( R R-R*RR”RžR“RŸR)R ((Rtdo_img—s6             cCs|i|ƒ|iƒdS(N(R RTR-t setnomoretags(R R-((Rt do_plaintext±s cCsdS(N((R ttagR-((Rtunknown_starttag·scCsdS(N((R R¦((Rtunknown_endtagºs(\RRRthtmlentitydefst entitydefsRR RRRR!R&R(R,R.R/R0R1R2R3R4R5R8R9R:R;R<R@RBRDRERGRHRJRKRMRNRPRQRRRTRVRYRZR\R]R^R_RbRdRgRhRnRqRrRsRtRuRvRxRzR{R}RyR€R‚R„R†R‡RˆR‰RŠR‹RŒRŽRR‘R’RRRRRƒR…R—R˜RšRœR£R¥R§R¨(((RRs²                                                                                c CsJdk}dk}|p|id}n|o|ddj}|o |d=n|o|d}nd}|djo |i}nFyt|dƒ}Wn/t j o#}|GdG|GH|i dƒnX|i ƒ}||ij o|iƒn|o|iƒ}n|i|iƒƒ}t|ƒ}|i|ƒ|iƒdS(Niis-ss test.htmlt-trt:(tsysR R+targvtsilenttfiletstdintftopentIOErrortmsgtexittreadRtcloset NullFormattertAbstractFormattert DumbWriterRtptfeed( R+R°R³RR®R½R±R¶R ((Rttest¾s2       t__main__( RRR RSt__all__tSGMLParseErrorRRRRR¿R(RRRÁRR¿RS((Rt?s   ÿ­ '