Topic: general online.effbot.org discussion (2005)
Branched from topic: general online.effbot.org discussion
Fredrik Lundh  95
12-30-2005 05:47 AM ET (US)
Fredrik Lundh  94
12-17-2005 03:25 AM ET (US)
Edited by author 12-23-2005 08:37 AM
(update: this bug is fixed in cElementTree 1.0.4)

Chris: this looks like a refcount bug in the default entity handler. Here's a patch:

=== cElementTree.c
==================================================================
--- cElementTree.c (revision 1128)
+++ cElementTree.c (local)
@@ -1953,7 +1953,6 @@
             res = PyObject_CallFunction(self->handle_data, "O", value);
         else
             res = NULL;
- Py_DECREF(value);
         Py_XDECREF(res);
     } else {
         PyErr_Format(

thanks! /F
Chris Olds  93
12-16-2005 07:42 PM ET (US)
I'm using ElementTree 1.2.6 with Python 2.4 on WinXP. With ElementTree.py, I can define entities by setting the entity dict in the XMLTreeBuilder object. With cElementTree, I get different behavior depending on whether or not a DOCTYPE is present in the file. If I have a doctype, parsing works, but I get a segfault when the program finishes. If I do not have a doctype, I get 'undefined entity' exceptions, but no segfault



doc = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE patent-application-publication SYSTEM "pap-v15-2001-01-31.dtd" []>
<patent-application-publication>
<subdoc-abstract>
<paragraph id="A-0001" lvl="0">A new and distinct cultivar of Begonia plant named &lsquo;BCT9801BEG&rsquo;.</paragraph>
</subdoc-abstract>
</patent-application-publication>"""

#from elementtree import ElementTree as et
import cElementTree as et

entities = {
 u'rsquo' : u"&#x2019;", # <!--=single quotation mark, right -->
 u'lsquo' : u"&#x2018;", # <!--=single quotation mark, left -->
}

parser = et.XMLTreeBuilder()
parser.entity.update(entities)
parser.feed(doc)
t = parser.close()
print t.find('.//paragraph').text
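
For readers on a current interpreter, here is a minimal sketch of the same entity-dictionary approach. It assumes Python 3 and the standard-library xml.etree.ElementTree rather than the cElementTree 1.0.x module discussed here. The sample document and the "doc.dtd" name are made up for illustration. The DOCTYPE is kept on purpose, since the report above notes that without one the parser raises 'undefined entity' errors.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; "doc.dtd" is only named, never loaded.
DOC = (
    '<!DOCTYPE doc SYSTEM "doc.dtd">\n'
    "<doc>named &lsquo;BCT9801BEG&rsquo;.</doc>"
)

parser = ET.XMLParser()
# Map entity names to the replacement text that should appear in the tree.
parser.entity.update({"lsquo": "\u2018", "rsquo": "\u2019"})
parser.feed(DOC)
root = parser.close()
```

Note that the values here are the actual characters; mapping a name to a string such as "&#x2019;" would put that literal text into the tree.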
Fredrik Lundh  92
12-13-2005 11:54 AM ET (US)
Working on it, working on it. (the technical issues are no problem, but where should it be placed?)
Manuzhai  91
12-13-2005 04:30 AM ET (US)
Great that ETree has landed in the stdlib. I hope you can sort out the issues with expat soon so that we can also use cElementTree.
Fredrik Lundh  90
12-04-2005 04:20 PM ET (US)
"As you have figured out the encoding would it be safer to use: codecs.open(filename, mode, encoding)"

Agreed. I will update the article.

Thanks! /F
mark_m  89
12-01-2005 12:35 PM ET (US)
I would be concerned that the cElementTree encoding workaround is unsafe.

The line that concerns me ..
    while 1:
        s = f.read(1000000) # <-- THIS LINE
        if not s:

If you are reading in a file that represents a single character as multiple bytes, then you could have a problem if a character starts at the 1,000,000-byte boundary but its other byte(s) are in the next chunk.

As you have figured out the encoding would it be safer to use:
  codecs.open(filename, mode, encoding)

(I have never used this function so not sure if it solves the problem)

Thanks
  Mark
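
The chunk-boundary concern can be reproduced on a small scale. The sketch below assumes Python 3; the helper names and the tiny chunk size are illustrative and are not taken from the article. Decoding each chunk on its own fails when a multi-byte character straddles a boundary, while an incremental decoder (the mechanism that codecs.open relies on) carries the partial bytes over to the next chunk.

```python
import codecs

def naive_decode(data, encoding, chunk_size):
    # Decodes every chunk independently; breaks on split characters.
    return "".join(
        data[i:i + chunk_size].decode(encoding)
        for i in range(0, len(data), chunk_size)
    )

def decode_in_chunks(data, encoding, chunk_size):
    # The incremental decoder buffers incomplete byte sequences.
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = [
        decoder.decode(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)

text = "caf\u00e9 \u2018quoted\u2019"
raw = text.encode("utf-8")  # the two bytes of "\u00e9" sit at offsets 3 and 4
decoded = decode_in_chunks(raw, "utf-8", 4)
```

With a chunk size of 4, naive_decode(raw, "utf-8", 4) raises UnicodeDecodeError on the first chunk, whereas decode_in_chunks returns the original text.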
   88
11-25-2005 10:35 PM ET (US)
Deleted by topic administrator 11-26-2005 04:38 AM
Ralph Meijer  87
11-24-2005 09:02 AM ET (US)
Referring to: http://online.effbot.org/2005_11_01_archive.htm#20051124.

1) I don't think the result of simple_eval('((...))') should be one-tuples of one-tuples (etc) of the empty tuple, but rather simply the empty tuple. The other brackets are just that.

2) cpython has a hardcoded *parser* limit of 32 nested expression levels. This has bitten me too while generating python from other languages. When writing python manually you wouldn't normally run into this limit.
Fredrik Lundh  86
11-23-2005 05:13 PM ET (US)
Yup. That's the "there is at least one more, but I'll return to that one later" buglet I mentioned in my last post (you're not the first one noticing this).

It's a bit embarrassing; my only defense is that the example was derived from a piece of code designed to parse the output from "repr()".

</F>
Marius Gedminas  85
11-23-2005 04:49 PM ET (US)
Speaking of bugs in the simple iterator-based parser, it also accepts invalid constructs like

>>> simple_eval("(1 2 3 4)")
(1, 2, 3, 4)


(/me looks around for a Preview button, shrugs, then submits)
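
A stricter evaluator is not hard to sketch. The version below is not the code from the article; it is a hypothetical Python 3 rewrite built on the standard tokenize module, limited to number and string literals and nested tuples. It requires commas between tuple items, so "(1 2 3 4)" is rejected, and it treats redundant parentheses as grouping, so "((()))" evaluates to the empty tuple.

```python
import ast
import io
import tokenize

# Layout tokens that carry no value for this tiny grammar.
_SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.COMMENT,
         tokenize.INDENT, tokenize.DEDENT}

def simple_eval(source):
    """Evaluate numbers, strings and nested tuples; reject everything else."""
    readline = io.StringIO(source).readline
    tokens = [t for t in tokenize.generate_tokens(readline)
              if t.type not in _SKIP]
    value, pos = _atom(tokens, 0)
    if tokens[pos].type != tokenize.ENDMARKER:
        raise SyntaxError("unexpected trailing token %r" % tokens[pos].string)
    return value

def _atom(tokens, pos):
    tok = tokens[pos]
    if tok.type in (tokenize.NUMBER, tokenize.STRING):
        return ast.literal_eval(tok.string), pos + 1
    if tok.string != "(":
        raise SyntaxError("unexpected token %r" % tok.string)
    pos += 1
    items, saw_comma = [], False
    while tokens[pos].string != ")":
        value, pos = _atom(tokens, pos)
        items.append(value)
        if tokens[pos].string == ",":
            saw_comma = True
            pos += 1
        elif tokens[pos].string != ")":
            raise SyntaxError("expected ',' or ')'")
    pos += 1
    if len(items) == 1 and not saw_comma:
        return items[0], pos  # plain parentheses, not a one-tuple
    return tuple(items), pos
```

Signed numbers and other expressions are deliberately out of scope here; they raise SyntaxError.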
Fredrik Lundh  84
11-17-2005 09:34 AM ET (US)
Douglas: asyncore+queue might be the best way to solve your problem. I'll see if I can dig up some xmlrpc-over-asyncore sample code this weekend.

For some asyncore code to build upon, see:

http://effbot.org/downloads/#effbot.org

and

http://effbot.org/zone/effnews-1.htm
http://effbot.org/zone/effnews-2.htm
Fredrik Lundh  83
11-17-2005 08:55 AM ET (US)
Alfred: I've fixed the mycontroller typo. The second script may yield a "TclError: wrong # args" error if you click on the canvas, but the posted version seems to work as expected if you drag lines.

Stewart: In my experience, being able to focus on one problem at a time is a great way to reduce the complexity, not increase it. As for other advantages, I hope to be able to address them in a followup article (feel free to look at the WCK version of this article for some hints).
Stewart Midwinter  82
11-16-2005 07:25 PM ET (US)
Following up on Alfred's post... I read the post on tkController. I understand that it brings to Tkinter the same separation that exists in WCK: one set of classes to draw the widgets, and one set to handle the events. What's missing for me is an understanding of why you would want to do this. Other than it being an interesting intellectual exercise, what particular benefits are there that offset the probable greater complexity?

thanks
S
Alfred Milgrom  81
11-16-2005 06:15 PM ET (US)
Thank you for the very interesting Tkinter Tricks (http://online.effbot.org/#20051113 posted on November 13). I just wanted to alert you to a simple error in the program as shown on the blog:

The first example defines a class MyController, but the code later on refers to it as ClickController.

As well, when I run the second example I encounter an error, but have not tracked it down. The refactored code presented as an alternative runs without error.

Thanks for your good work,
Alfred Milgrom (fredm [at] smartypantsco [dot] com)
Douglas Beethe  80
11-10-2005 04:43 PM ET (US)
Edited by author 11-10-2005 04:46 PM
Regarding /m75 and /m76 -- I left out one important point. The socket.setdefaulttimeout(...) works OK for a single-threaded app, but falters in a multi-threaded app where the threads have differing timeout constraints. This implies getting down to the select() level -- do you happen to know of any examples which might have extended your xmlrpclib code base to support multi-threaded clients with independent timeouts? Perhaps something melding asyncore-like capability?
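
One way to give each client its own timeout without dropping to select() is a transport subclass. This is a sketch only: it assumes Python 3, where xmlrpclib lives on as xmlrpc.client, and the TimeoutTransport name and the localhost URL are hypothetical. Each proxy gets its own transport, so threads with different constraints do not share a global setting.

```python
import xmlrpc.client

class TimeoutTransport(xmlrpc.client.Transport):
    """HTTP transport whose connections use a per-instance timeout."""

    def __init__(self, timeout, **kwargs):
        super().__init__(**kwargs)
        self._timeout = timeout

    def make_connection(self, host):
        conn = super().make_connection(host)
        # http.client applies this value when the socket is opened.
        conn.timeout = self._timeout
        return conn

# Hypothetical endpoint; constructing the proxy does not open a connection.
fast_proxy = xmlrpc.client.ServerProxy(
    "http://localhost:8000/", transport=TimeoutTransport(2.5))
slow_proxy = xmlrpc.client.ServerProxy(
    "http://localhost:8000/", transport=TimeoutTransport(30.0))
```

A call that exceeds the timeout should surface as a socket timeout exception in the calling thread, which can then retry later. For https URLs the same idea would need to be applied to xmlrpc.client.SafeTransport.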
Fredrik Lundh  79
10-30-2005 11:07 AM ET (US)
Edited by author 10-30-2005 11:39 AM
Yeah, it sure makes you wonder why "a world-leading expert in search-engine optimization" has to sue people who happen to mention their name, in order to prevent those posts from appearing in the first 10 google hits. Shouldn't a little optimization be enough?
Horse  78
10-27-2005 11:11 AM ET (US)
"Våra konsulter har flera års erfarenhet av Sökmotor Positionering" ("Our consultants have several years of experience in search engine positioning"), says the H & N Consulting Web site. Once again the term "consultant" seems to be used somewhat euphemistically.
   77
10-26-2005 03:23 PM ET (US)
Deleted by topic administrator 10-27-2005 09:03 PM
Fredrik Lundh  76
10-26-2005 06:45 AM ET (US)
Edited by author 10-26-2005 06:46 AM
Hi Douglas,

You should be able to fix this in your application code simply by setting the global socket timeout before you issue the requests. Look for "setdefaulttimeout" on this page for details:

http://docs.python.org/lib/module-socket.html

</F>
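
A minimal sketch of the global-timeout approach, assuming Python 3 (the 10-second value is arbitrary). Sockets created after the call inherit the default, which is why setting it before issuing requests also affects the connections that xmlrpclib opens internally.

```python
import socket

# Applies to every socket created from now on in this process.
socket.setdefaulttimeout(10.0)

probe = socket.socket()
inherited = probe.gettimeout()  # new sockets pick up the default
probe.close()
```

Because the setting is process-wide, every thread sees the same value.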
Douglas Beethe  75
10-18-2005 10:35 AM ET (US)
Fredrik,

Having a bit of trouble with XMLRPC in an environment where the underlying TCP connections are a bit flaky. The actual connections are over an OpenVPN circuit which does an admirable job of reconnecting on top of an unreliable transport. In this less than ideal environment, we occasionally see a case where a call invoked through an instance of xmlrpclib.ServerProxy(...) will simply hang, never to return, even though other instances on other machines which are interacting with the same remote XMLRPC server continue to be serviced (over different transport lines of course).

One way around this would be to have an optional timeout capability on the client side which would allow the client to break off the attempt and try later. Are you aware of any implementations for this sort of behaviour with xmlrpclib?

DB
   74
09-03-2005 12:03 PM ET (US)
Deleted by topic administrator 11-04-2005 10:06 PM
Ian Bicking  73
08-24-2005 11:32 AM ET (US)
I seem to remember that Tom Lord had disappeared from the community for some time before he popped up with Arch, though he had also worked with the FSF on various projects in the past. I vaguely remember that he created the Guile Scheme interpreter from SCM; that was a challenged project from the beginning, and before it really was usable for its primary purpose he passed on maintainership (I don't know if that mattered, but I think that project was doomed from the beginning regardless). From what I can tell, it really is for personal reasons.
Aloys  72
07-20-2005 02:57 AM ET (US)
Hi Fredrik,

I find your cElementTree a really great tool, and I use it in a current project.
Unfortunately this library doesn't support the weak referencing.
I updated your C code to make it work, as explained in the Python Manual, and now everything is perfect!
If you're interested in the changes I've done, I can send them to you.

Thanks!
Olle Jonsson  71
05-25-2005 06:41 AM ET (US)
Oh, I came by the blog, via why the lucky stiff's comment on the Twisted mailing list posting on "violence humour".

Thanks for the Swedish news/commentary coverage. I live in Copenhagen, so your blog was a very welcome find.
Stewart Midwinter  70
05-13-2005 12:02 PM ET (US)
What's the smilie symbol for a sheepish grin? I did in fact look through the changes list for the latest version of PIL, but failed to notice the addition of the 'width' option for line drawing. Thanks for adding that!

S
Fredrik Lundh  69
05-13-2005 11:51 AM ET (US)
Edited by author 05-13-2005 11:52 AM
Did you see this comment in the 1.1.5 CHANGES document:

Added width option to ImageDraw.line(). The current implementation works best for straight lines; it does not support line joins, so polylines won't look good.

(An improved line join algorithm would be welcome, of course)
midtoad  68
05-11-2005 08:16 PM ET (US)
Fredrik, what would be the best way for me to go about adding a 'width' parameter to the line_draw method in PIL? I'd like to be able to draw a line wider than a single pixel. I'm building an open-source web app using CherryPy, SQLObject, PIL and ElementTree (among others).

thanks
S
Fredrik Lundh  67
05-04-2005 04:18 AM ET (US)
Comments are not part of the XML information model (an XML processor is free to ignore them), and ElementTree isn't really designed for applications that need to create XML for human consumption, so leaving them out wasn't a very hard decision.

If you need comment and PI support, you can use a custom parser. See:

http://effbot.org/zone/element-pi.htm
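
For comparison, later versions of the library gained a built-in option for this. The sketch below assumes Python 3.8 or newer, where the standard-library TreeBuilder accepts insert_comments; it is not the custom parser from the page linked above.

```python
import xml.etree.ElementTree as ET

# insert_comments=True keeps comments as Comment nodes in the tree.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring("<doc><!-- keep me --><item/></doc>", parser=parser)

first = root[0]
is_comment = first.tag is ET.Comment
round_trip = ET.tostring(root, encoding="unicode")
```

The comment survives both in the tree and in the serialized output.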
Costas Malamas  66
04-28-2005 04:45 AM ET (US)
I am wondering why ElementTree doesn't support preserving comments in the XML structure. I am guessing complexity isn't the issue here.

Here's my patch to enable it:
1126a1127
> parser.CommentHandler = self._comment
1186a1188,1192
> def _comment(self, text):
> self._target.start(Comment,{})
> self._target.data(self._fixtext(text))
> self._target.end(Comment)
>
gervin23  65
04-28-2005 03:05 AM ET (US)
Edited by author 04-28-2005 03:07 AM
i've been using elementtree and sgmlop for a couple days now and must say i love the speed and simplicity, nice job.

one question however, when using the sample code for grabbing anchor tags found on http://effbot.org/zone/sgmlop-patterns.htm, links to urls with ampersands (i.e. this page) get truncated at the first occurrence. the rest of the href data seems to be getting trapped inside resolve_entityref method.

for example, sending output to the console and using http://foo.com?a=1&b=2&c=3 as a url i get (in order of execution):
b=2&c=3 #from resolve_entityref()
http://foo.com?a=1 #from finish_starttag()

i'm using sgmlop-1.1.1-20040207.

any ideas?
Jesse Andrews  64
04-17-2005 10:05 PM ET (US)
Saw your comment about needing to de-gasbag a planet, so I did something about it . . .

I wrote a program that creates a GreaseMonkey user script that removes whoever you want from a planet (it only works within firefox). If you are interested check http://overstimulate.com/reorbit ... If you have fixed it already sorry for bothering you ;)

btw, thanks for all the python resources! I get to work full time in python thanks to you, Mark Pilgrim, ... (too many to list)
Casey Whitelaw  63
04-01-2005 06:40 PM ET (US)
Great to see the new PIL release, thanks a lot. I use PIL for all kinds of little toys (the latest is an image compositor thingo at http://projects.caseyporn.com/multimatic/ ), and it's amazingly useful.
max khesin  62
03-31-2005 09:55 AM ET (US)
Kudos on the PIL release!
One question has been bothering me for a while: why not fix the TIFF group 4 issue - I mean this question has come up on the Net since 2000 and you pointed to a patch here
http://mail.python.org/pipermail/image-sig/2003-July/002354.html
quite a while ago.
Just to throw in some perspective: TIFF group 4 is billions of documents. It is the standard image repository format used for document storage.
And I hate patching :).
Fredrik Lundh  61
03-30-2005 02:04 AM ET (US)
Btw, for those asking about comment support in ElementTree, this page shows one (unsupported) way to deal with comments and processing instructions:

http://effbot.org/zone/element-pi.htm
Fredrik Lundh  60
03-24-2005 06:59 AM ET (US)
Now that plist files are appearing all over the place, shouldn't plistlib be made available on all platforms?
Just van Rossum  59
03-23-2005 11:39 AM ET (US)
Regarding plist: for completeness' sake (but not a fun example of ElementTree) there's also plistlib.py in plat-mac in the Python std lib.
Fredrik Lundh  58
03-02-2005 03:22 PM ET (US)
Edited by author 03-02-2005 03:23 PM
Alain, the current ElementTree parser ignores comments (they're not really part of the infoset, just like Python comments are not part of the Python program). Also see:

http://www.quicktopic.com/28/H/v2dA7ee7u55Jx/p9.10.1
Alain  57
02-25-2005 06:38 AM ET (US)
I am experimenting with ElementTree. I find it marvellous but i miss something. While parsing an XML file, the comment info gets lost. Is there any way to preserve it? I really need the comments !

Alain
infidel  56
02-24-2005 12:11 PM ET (US)
Just curious, did you intend for ElementTree to be a pun on "Elementary", or was it just a happy coincidence?
John Mudd  55
02-23-2005 09:19 AM ET (US)
I looked at the elementTree example:
http://effbot.org/zone/element-index.htm

root = Element("html")
head = SubElement(root, "head")
title = SubElement(head, "title")
title.text = "Page Title"



I looked at the Hierarchical data objects (HDO) example:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/286150

model=HierarchicalData()
# model.person is constructed on the fly:
model.person.surname = "uwe"
model.person.name = "schmitt"
model.number = 1


I like how HDO builds the tree "on the fly". Is this available with elementTree? If not, is it worth considering?

John
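
ElementTree itself builds trees explicitly, but the "on the fly" style can be layered on top with a thin wrapper. The AutoNode class below is a hypothetical sketch, assuming Python 3 and the standard-library module; it is not part of ElementTree or of the HDO recipe. Attribute access finds or creates a child element, and attribute assignment sets that child's text.

```python
import xml.etree.ElementTree as ET

class AutoNode:
    """Wraps an Element; attribute access creates children on demand."""

    def __init__(self, element):
        # Bypass our own __setattr__ for the wrapper's internal state.
        object.__setattr__(self, "_element", element)

    def _child(self, name):
        child = self._element.find(name)
        if child is None:
            child = ET.SubElement(self._element, name)
        return child

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        return AutoNode(self._child(name))

    def __setattr__(self, name, value):
        self._child(name).text = str(value)

model = AutoNode(ET.Element("model"))
model.person.surname = "uwe"   # <person> is created on first access
model.person.name = "schmitt"
model.number = 1
xml_text = ET.tostring(model._element, encoding="unicode")
```

The trade-off is that a mistyped attribute name silently creates a new element, which the explicit SubElement calls avoid.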
exmu  54
02-21-2005 05:26 AM ET (US)
What can I say... Duh. Sorry.

PS : Bravo for this, I think the blatant missing feature of (c)ElementTree now is that it's not included in the standard distribution.

(or maybe that's already the case and I'm missing sth in the documentation again!)
Fredrik Lundh  53
02-21-2005 05:21 AM ET (US)
Yes (that's what "This is an add-on" and "You also need a recent version of the standard ElementTree library" is trying to say ;-)
exmu  52
02-21-2005 05:14 AM ET (US)
Hi, I have a problem using cElementTree : importing it (on python 2.3, win32) raises the problem :

ImportError: No module named ElementTree

Does that mean cElementTree requires ElementTree ?
Fredrik Lundh  51
02-17-2005 02:49 PM ET (US)
Like most other container types in Python, the Element structure doesn't include parent pointers. The usual way to work around this is to operate on parents rather than children. If that's too difficult, creating a child-to-parent mapper for a given tree can be done in a single line:

parent = dict((c, p) for p in tree.getiterator() for c in p).get

Parent is now a callable that returns the parent for a given child, or None if the parent is not known.

for e in tree.findall(".//tag"):
    p = parent(e)
    ...
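
The same idea, written as a sketch for Python 3 with the standard-library xml.etree.ElementTree, where iter() replaces getiterator(). The sample markup and the next_sibling helper are illustrative additions; the helper addresses the sibling question by looking the element up in its parent's child list.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<html><body><p>one</p><p>two</p></body></html>")

# Child-to-parent mapping, rebuilt whenever the tree structure changes.
parent_of = {child: parent for parent in root.iter() for child in parent}

def next_sibling(elem):
    """Return the element following elem under the same parent, or None."""
    parent = parent_of.get(elem)
    if parent is None:
        return None
    siblings = list(parent)
    index = siblings.index(elem)
    return siblings[index + 1] if index + 1 < len(siblings) else None

first_p = root.find(".//p")
```

The mapping is a snapshot: elements added or moved afterwards are not reflected until it is rebuilt.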
Jason W.  50
02-16-2005 08:01 PM ET (US)
Ok, maybe I am just missing something major...but I have looked all over and tried many things.

How in the heck does one go about getting the parent of a node when using elementtree? I am parsing HTML and need to grab the parent of a node that I have found (through root.findall()). Seems this would be common, but I am having no luck.

At first I was using PyXML to do what I am trying to do, but it was huge in memory (I guess because it keeps up with direct parent/sibling pointers for each node). Anyway, how would one grab the parent? Also, is there an easy way to grab the next sibling?

Thank you!

--
Jason
Fredrik Lundh  49
02-10-2005 02:21 PM ET (US)
To use an arbitrary parser to build trees, you need to create a wrapper that maps parser events to TreeBuilder calls. See the API docs for some more details.

An alternative solution is to use a validating parser that creates a light-weight model, and convert that model to an Element tree after parsing. I've posted a PyRXP(U)-based example to:

http://online.effbot.org/2005_02_01_archive.htm#elementrxp
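
The wrapper approach can be sketched with any event-driven parser. The example below assumes Python 3 and uses the standard-library html.parser.HTMLParser purely as a stand-in for "an arbitrary parser"; it is not a validating parser and does not resolve external entities. The class name is hypothetical. The point is only the mapping from parser callbacks to TreeBuilder's start, data, end and close calls.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

class TreeBuildingParser(HTMLParser):
    """Forwards parser events to an ElementTree TreeBuilder."""

    def __init__(self):
        super().__init__()
        self._builder = ET.TreeBuilder()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        self._builder.start(tag, {k: v or "" for k, v in attrs})

    def handle_endtag(self, tag):
        self._builder.end(tag)

    def handle_data(self, data):
        self._builder.data(data)

    def close(self):
        super().close()                # flush any buffered text
        return self._builder.close()   # returns the root Element

wrapper = TreeBuildingParser()
wrapper.feed('<p class="x">hi<b>there</b></p>')
tree_root = wrapper.close()
```

The input has to be well nested for this to work, since TreeBuilder expects matching start and end calls.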
Adam Collard  48
02-09-2005 10:05 AM ET (US)
I'm trying to use (the excellent - thank you) ElementTree to process XML files with general external entities (which refer to a 'SYSTEM' file)

My understanding is that ElementTree uses expat by default which, as a non-validating parser doesn't automatically resolve the entities. I have tried using the xmlval module from the xmlproc package and use that instead of expat but am having no success. This is the code I'm using:

----

from elementtree import ElementTree
from xml.parsers.xmlproc import xmlval


p = xmlval.XMLValidator()

tree = ElementTree.parse('foo.xml',p)
root = tree.getroot()
root == None # True
----

foo.xml looks like this:
----
<!DOCTYPE example SYSTEM "example.dtd"
[
<!ENTITY BAR SYSTEM "/home/user/bar.xml">
]>
<test>
  &BAR;
</test>
----

Is this the kind of approach I should be taking? Are there any examples on using ElementTree with an XML document with external entities similar to above?

Thank you in advance for your assistance,

Adam
Kent Johnson  47
02-07-2005 02:43 PM ET (US)
Just a quick thank you for cElementTree. I am writing some code to do some ad hoc matching between elements of two XML files. The total file size is almost 8MB. As I develop the matching program I process the files over and over. I can read both files and iterate all their elements in the blink of an eye.

One of the files is abstracted from a 24MB file. I used cElementTree to create the abstract, too. I tried dom4j but it choked. Text editors choke on the file too. cElementTree ate it up and was hungry for more.

Thanks!
Fredrik Lundh  46
02-07-2005 01:11 PM ET (US)
"Is it possible to use ElementTree to add processing instructions?"

Not really; if you want comments and PI:s to appear before the root element, you currently have to print them out yourself...

(I'm working on a new writer with better support for various things, including "invisible" elements which lets you add comments and PI:s on the document level, but it's not ready for release.)
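
"Printing them out yourself" might look like the following sketch, which assumes Python 3 and the standard-library module. The function name is hypothetical, and the href value is inserted verbatim, so it is assumed not to contain quotes or other characters that need escaping.

```python
import xml.etree.ElementTree as ET

def serialize_with_stylesheet(root, href):
    """Write the stylesheet PI by hand, then the serialized tree."""
    pi = '<?xml-stylesheet type="text/xsl" href="%s"?>' % href
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0"?>\n' + pi + "\n" + body

root = ET.Element("projectlist")
document = serialize_with_stylesheet(root, "projects.xslt")
```

The processing instruction ends up before the root element because it never passes through the tree at all.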
Ramon M. Felciano  45
02-02-2005 10:12 PM ET (US)
Is it possible to use ElementTree to add processing instructions? I'm trying to add a reference to a stylesheet to my generated XML as follows:

root = Element("projectlist")
pi = ProcessingInstruction("xml-stylesheet","""type="text/xsl" href="projects.xslt" """)
root.insert(0,pi)

This inserts the PI under the root node; what I want is to insert it *before* the root node. I tried inserting it into an ElementTree instance, but it doesn't have an insert method and only seems to know about its root (i.e. no PIs that might precede it) -- any suggestions?

Thanks!

Ramon
Fredrik Lundh  44
01-27-2005 06:26 AM ET (US)
"Has the write() code been optimized at all in cElementTree?"

I'm afraid not; it uses the ElementTree serialization code. With cET 1.0 out of the door, better serialization code (for both libraries) is high on my list.

(note that the XML files you included are semantically identical, from an infoset perspective. computers should not care about the transformation...)
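
For those who prefer the unprefixed form anyway, later versions of the library allow the output prefix to be chosen. The sketch below assumes Python 3 and the standard-library xml.etree.ElementTree, where register_namespace is available; it does not apply to cElementTree 1.0. Registering an empty prefix makes the serializer emit the namespace as the default one.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns/graphml"
SOURCE = (
    '<graphml xmlns="%s">'
    '<graph id="G" edgedefault="directed"><node id="foo"/></graph>'
    "</graphml>" % GRAPHML_NS
)

root = ET.fromstring(SOURCE)

# The registry is global: it affects all later serialization in the process.
ET.register_namespace("", GRAPHML_NS)
unprefixed = ET.tostring(root, encoding="unicode")
```

Namespace declarations that are never used by an element or attribute, such as the xsi and y prefixes in the file above, are still dropped on output.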
David Niergarth  43
01-26-2005 11:47 PM ET (US)
"But it was a pretty nice little piece of deceptive code, don't you think?"

Well it was certainly quite an optimization! Mixing the deceptive generator expression with timeit was especially sly. You should haul it out again on April Fool's Day.
Ramon M. Felciano  42
01-26-2005 07:27 PM ET (US)
Has the write() code been optimized at all in cElementTree? I'm seeing very nice speedups compared with libxml2 when parsing and processing, but I lose them all when I serialize out to disk. I also note that in a simple parse/write test, cElementTree does not appear to write out the identical XML file (adds in liberal namespace info to nodes). For example, given this GraphML file as input:

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns/graphml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:y="http://www.yworks.com/xml/graphml">
  <graph id="G" edgedefault="directed">
 <node id="foo"/>
</graph>
</graphml>

cElementTree.write() will produce the following:

<ns0:graphml xmlns:ns0="http://graphml.graphdrawing.org/xmlns/graphml">
  <ns0:graph edgedefault="directed" id="G">
 <ns0:node id="foo" />
</ns0:graph>
</ns0:graphml>

Any suggestions?
Fredrik Lundh  41
01-26-2005 04:21 PM ET (US)
"I think there's a problem with your latest benchmark"

You're 100% correct. The benchmark is flawed, on purpose (hence the evil laugh ;-). But it was a pretty nice little piece of deceptive code, don't you think?
David Niergarth  40
01-26-2005 03:37 PM ET (US)
[Had to edit my post -- had pasted the same timing twice, now fixed.]
David Niergarth  39
01-26-2005 03:33 PM ET (US)
Edited by author 01-26-2005 03:35 PM
I think there's a problem with your latest benchmark: it looks like you're only measuring the time it takes to create the generator object. It doesn't look like your code is executing. If you wrap the generator in a list(), to force execution, the timing takes longer. (I'm using the tiny samples/simple.xml from the cElementTree zip, below.)

(dn@change)(02:22P)
(%:~/src/cElementTree-1.0-20050126)- python2.4 -m timeit -s "import cElementTree" "matches = (elem.get('value') for event, elem in cElementTree.iterparse('samples/simple.xml') if elem.get('name') == 'reselectApi')"
10000 loops, best of 3: 23.9 usec per loop

(dn@change)(02:23P)
(%:~/src/cElementTree-1.0-20050126)- python2.4 -m timeit -s "import cElementTree" "matches = list(elem.get('value') for event, elem in cElementTree.iterparse('samples/simple.xml') if elem.get('name') == 'reselectApi')"
10000 loops, best of 3: 116 usec per loop
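
The effect is easy to reproduce without any XML at all. This small sketch, assuming Python 3 and using a made-up expensive function, shows that creating a generator expression runs none of the per-item work; only consuming it does.

```python
calls = []

def expensive(x):
    # Records each invocation so the laziness is observable.
    calls.append(x)
    return x * 2

lazy = (expensive(x) for x in range(5))
calls_after_creation = len(calls)      # nothing has run yet

results = list(lazy)                   # forces the work
calls_after_consumption = len(calls)
```

Only the outermost iterable (here range(5), in the benchmark the iterparse call) is evaluated when the generator is created, so timing the assignment alone measures almost nothing.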
Torsten Marek  38
01-24-2005 03:03 PM ET (US)
http://www.intel.com/products/notebook/pro...celeron_m/index.htm
says that Celeron M processors go only up to 1.5 GHz, and 8600s are delivered with Celeron M or Pentium M.

I seem to have too much time on my hands to actually care about that;-). On the other hand, google is just too handy...
Paul Boddie  37
01-24-2005 02:37 PM ET (US)
The article says: "It's a Centrino 1.7GHz, which is about equivalent to a P4-3GHz"

Unfortunately, this is unlikely to be true, and you can see that from the Pystone results. It may be the case that Centrino bundles use Pentium-M CPUs, but for all I know (and care) it could be a Celeron on the motherboard. Anyway, apart from all the power-saving horseplay that could go on, the main difference between Pentium-M and P4 is probably the cache - again, from my own limited knowledge of the field and my P2-266-based perspective of the world.
Torsten Marek  36
01-24-2005 01:43 PM ET (US)
Just another note on the pystone numbers: when I read Uche's article this morning, I computed the pystone for my desktop machine, and although the CPU is less powerful (1.4 GHz Athlon), the result came out higher. On my ThinkPad T41p, which has 512MiB of RAM and a 1.7GHz Pentium-M and is therefore comparable to his Dell notebook, I get the following numbers:
Pystone(1.1) time for 50000 passes = 1.33
This machine benchmarks at 37594 pystones/second

Python is 2.3.4. Either my notebook is magically enhanced (which it of course is), Debian is outrageously faster than FC3 (true, too) or something is wrong about his numbers.
Anyway, keep up the good work, I look forward to adding (c)ElementTree packages to Debian soon and using it in my own software!
Fredrik Lundh  35
01-24-2005 12:32 PM ET (US)
Edited by author 01-24-2005 12:34 PM
"are we to conclude that this means namespace declarations which come into and go out of scope on the same element will be nested appropriately"

That's the intention, at least. I guess I have to add some more tests to make sure that this is always true...
infidel  34
01-24-2005 12:20 PM ET (US)
I love the new iterparse function, but one thing about the example you give bothers me. The SAX documentation warns that the order in which namespaces start and end may not be nested "properly" with respect to each other. I see that the 'end-ns' event from iterparse just returns None, are we to conclude that this means namespace declarations which come into and go out of scope on the same element will be nested appropriately?
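
A sketch of how the namespace events can be tracked, assuming Python 3 and the standard-library iterparse rather than the cElementTree 1.0 release discussed here. The sample document and function name are illustrative. Since 'end-ns' carries no payload, the code simply pops the most recent declaration, which is only correct if declarations do end in reverse order of their start.

```python
import io
import xml.etree.ElementTree as ET

XML = b'<root xmlns:a="urn:a"><a:child/></root>'

def trace_namespaces(source):
    """Record which (prefix, uri) pairs are in scope at each start tag."""
    scopes = []   # stack of (prefix, uri) tuples
    trace = []
    events = ("start-ns", "end-ns", "start", "end")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            scopes.append(payload)      # payload is (prefix, uri)
        elif event == "end-ns":
            scopes.pop()                # payload is None
        elif event == "start":
            trace.append((payload.tag, list(scopes)))
    return trace, scopes

trace, leftover = trace_namespaces(io.BytesIO(XML))
```

An empty stack at the end of the document is a cheap sanity check that every 'start-ns' was matched by an 'end-ns'.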
Fredrik Lundh  33
01-22-2005 09:44 AM ET (US)
Edited by author 01-22-2005 09:49 AM
cElementTree in the standard library? why not; it wouldn't be the first thing I've written that makes it into the library... but I'm not going to lobby for that myself; I usually leave such things to enthusiastic users (that was a hint ;-)
Stewart Midwinter  32
01-21-2005 05:44 PM ET (US)
thanks Timothy for that tip. I am aware of the Python Properties file, and frequently use it to alternate between different Python versions on my PC. In this case, though, I had updated to 2.4 and forgot to also update that Properties file.

S
Timothy Fitz  31
01-21-2005 09:27 AM ET (US)
Edited by author 01-21-2005 09:28 AM
By default, SciTE doesn't "just call 'python'", though it looks like it does. Check your python.properties (Options -> Open python.properties): at the very bottom of that file are two lines that have "C:\python23\python.exe" and "C:\python23\pythonw.exe", which you can modify to your heart's content. (If you don't have Open python.properties as an option, you're running the single-executable version; the easy solution is to grab the full version.)
daf  30
01-20-2005 12:21 PM ET (US)
Hey, don't you think it would be a good idea to have cElementTree in python standard library?? I find it way more useful than current standard xml lib.. besides it would contribute in making python faster at least in XML processing.

Just want to thank you for your great work, I use this little thing almost everywhere.
Stewart Midwinter  29
01-20-2005 10:50 AM ET (US)
<smack!> (sound of palm slapping forehead). Now the light goes on. I tested the tkFileDialog from 2.4 by opening it in the SciTE editor, then executing it from there. By default, SciTE just calls "python", and the default python installation for me is 2.3, even if I have 2.4 (and 1.52) installed as well.

I'll go back and run it from a command prompt and see what happens. I'm sure it will work properly then.

Thanks!
S
Fredrik Lundh  28
01-20-2005 06:52 AM ET (US)
That SF bug might explain the la-la-land, but it doesn't explain why the exceptions you got under 2.4 were identical to the exceptions I got when I tried to use 2.4's Tkinter with a 2.3 interpreter. Methinks you reported two bugs, one of which was a pilot error.
Stewart Midwinter  27
01-20-2005 12:04 AM ET (US)
Good suggestion to file a bug - someone else had recently noticed the same problem and filed a bug along with a solution. I may be able to patch my own copy of tkFileDialog, or get the tip version off the CVS tree.
The bug is here:
https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=852314
thanks
S



On 19 Jan 2005 21:16:57 -0000, QT - Fredrik Lundh
<qtopic+28-v2dA7ee7u55Jx@quicktopic.com> wrote:

> Stewart, running the tkFileDialog and tkFont modules as scripts
> gives the same error on my machine (not that it's an error
> inside Tkinter itself), but both modules seem to work just fine
> if you call the functions from inside a program. Please report
> it to the Python developers:
> http://sourceforge.net/bugs/?group_id=5470

--
Stewart Midwinter
stewart@midwinter.ca
stewart.midwinter@gmail.com
Fredrik Lundh  26
01-19-2005 04:43 PM ET (US)
Edited by author 01-20-2005 03:13 AM
As far as I can tell, ElementTree is a bit slower with psyco than it is without it. It's simply too much C code involved (pyexpat), and too many calls from C code to Python.

(note that pure-Python parsers are typically 1000-10000 times slower than cElementTree. it's a lot more than just a straightforward translation to C...)
Fredrik Lundh  25
01-19-2005 04:16 PM ET (US)
Edited by author 01-19-2005 04:24 PM
Stewart, running the tkFileDialog and tkFont modules as scripts gives the same error on my machine (not that it's an error inside Tkinter itself) only when I've messed up the paths. I tried running the 2.4 modules using a 2.3 interpreter; doesn't work. With the right versions, everything works just fine.
happy_broccoli  24
01-19-2005 04:04 PM ET (US)
To Fredrik RE: cElementTree

I have generally found that when i go from pure python to c-python, that the cpu performance benefits match those that you would get from applying psyco to the pure python.

Not sure about memory, and of course you get greater portability with c-python. Anyway, it would be nice if you could add psyco to your list of tests.
Stewart Midwinter  23
01-18-2005 09:21 PM ET (US)
tkFileDialog bug?

Perhaps it's just me, but I'm unable to get the askdirectory function in tkFileDialog.py to work in Python 2.3.4. When invoked using the easygui module's diropenbox method, the tkFileDialog module goes off to la-la land, never to return.

Next, I tried editing the tkFileDialog module directly, adding the following to __main__:
    getdirectory=askdirectory()
    print "directory", getdirectory.encode(enc)
Same results.

I've also just downloaded and installed Python 2.4 and tried the same function, and in this case an exception is returned:

Traceback (most recent call last):
  File "tkFileDialog.py", line 202, in ?
    openfilename=askopenfilename(filetypes=[("all files", "*")])
  File "tkFileDialog.py", line 125, in askopenfilename
    return Open(**options).show()
  File "C:\Programs\Python24\Lib\lib-tk\tkCommonDialog.py", line 48, in show
    w = Frame(self.master)
  File "C:\Programs\Python24\Lib\lib-tk\Tkinter.py", line 2374, in __init__
    Widget.__init__(self, master, 'frame', cnf, {}, extra)
  File "C:\Programs\Python24\Lib\lib-tk\Tkinter.py", line 1855, in __init__
    BaseWidget._setup(self, master, cnf)
  File "C:\Programs\Python24\Lib\lib-tk\Tkinter.py", line 1830, in _setup
    _default_root = Tk()
  File "C:\Programs\Python24\Lib\lib-tk\Tkinter.py", line 1569, in __init__
    self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
TypeError: create() takes at most 5 arguments (8 given)
>Exit code: 1
>python -u "tkFont.py"
Traceback (most recent call last):
  File "tkFont.py", line 190, in ?
    root = Tkinter.Tk()
  File "C:\Programs\Python24\Lib\lib-tk\Tkinter.py", line 1569, in __init__
    self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
TypeError: create() takes at most 5 arguments (8 given)

what could I be doing wrong?
Tim Cradic  22
01-18-2005 09:38 AM ET (US)
I am using WinXP Pro and Python 2.3.4 with the 11/04 version of win32all, and msvcrt.kbhit() isn't responding properly. I also tried Python 2.4 with its win32all, and it made no difference. When I run the example-2 code, I press the spacebar repeatedly and the spaces show up in front of the "."s, but the periods keep on coming, pushing the spaces and cursor along in front of them. I want to use your Console to show the user dynamic status. I can get the Console to alert me when a key is pressed, but how do I kill the console when I'm through with it? A small example using the event subtypes and attributes, especially "state()", would be greatly appreciated.
Fredrik Lundh  21
01-17-2005 01:53 PM ET (US)
> I downloaded cElementTree 0.9.2 (python 2.3) and when I call
> getchildren() I get a Windows XP Error Message. The call to
> this function runs fine when I use ElementTree.

I've confirmed this in 0.9.2, and it will be fixed in the next release.
Note that getchildren() is deprecated, and it will be removed in some distant future. Instead, you should simply treat the element node as the sequence it is. (for item in elem, elem.append, elem.remove, len(elem), elem[x], etc)

(if you really need a list, use list(elem)).
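
A minimal sketch of the sequence-style access described above. It assumes the standard library's xml.etree.ElementTree as a stand-in for the elementtree/cElementTree packages, and the tag names are made up for illustration:

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring("<root><a/><b/><c/></root>")

# Iterate over the children directly instead of calling getchildren().
tags = [child.tag for child in elem]

# The usual sequence operations work on the element itself.
count = len(elem)              # number of direct children
first = elem[0]                # index access
elem.remove(first)             # remove a child
elem.append(ET.Element("d"))   # add a child at the end

# Take a snapshot as a plain list only when one is really needed.
children = list(elem)
```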

</F>
Ryan  20
01-17-2005 01:30 PM ET (US)
I downloaded cElementTree 0.9.2 (python 2.3) and when I call getchildren() I get a Windows XP Error Message. The call to this function runs fine when I use ElementTree.

Thanks
Fredrik Lundh  19
01-15-2005 08:42 AM ET (US)
tim, if all you need is to check for a key press, you can use Python's standard msvcrt module. see the second example on this page:

http://effbot.org/librarybook/msvcrt.htm

if you need the console module for other reasons, use the peek method to handle incoming events without blocking.
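
A rough sketch of the msvcrt approach, not the example from the linked page. The helper name run_until_keypress and its parameters are hypothetical; msvcrt.kbhit is the real standard library call, and it is only available on Windows, so the import is done lazily and the key check can be swapped out:

```python
import time


def run_until_keypress(work, kbhit=None, delay=0.1):
    """Call work() repeatedly until kbhit() reports a pending key press.

    Returns the number of times work() was called.
    """
    if kbhit is None:
        import msvcrt  # Windows only; imported here so the module loads anywhere
        kbhit = msvcrt.kbhit
    steps = 0
    while not kbhit():
        work()             # one slice of the background job
        steps += 1
        time.sleep(delay)  # avoid spinning the CPU between checks
    return steps
```

On Windows, calling run_until_keypress(do_one_step) with your own do_one_step function should return once any key is pressed; the key is still waiting in the input buffer afterwards, so read it with msvcrt.getch() if you don't want it to show up later.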
Fredrik Lundh  18
01-14-2005 12:34 PM ET (US)
findtext doesn't convert anything; "& l t ;" in the original XML file is how "<" is encoded in XML. If you're writing things out as, say, HTML, you have to encode them again, according to the HTML rules.

As long as you stick to ASCII, you can use the cgi.escape function to convert a string to HTML.
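
A small sketch of the round trip. It assumes the standard library's xml.etree.ElementTree, and it uses html.escape because cgi.escape is no longer available in current Python versions; quote=False is passed so that, like cgi.escape by default, only &, < and > are encoded. The document and tag name are made up:

```python
import xml.etree.ElementTree as ET
from html import escape

root = ET.fromstring("<doc><title>a &lt; b</title></doc>")

# The parser has already decoded the entity; findtext just returns the text.
text = root.findtext("title")

# Encode again before writing the text into HTML (or into XML by hand).
encoded = escape(text, quote=False)
```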
Bill Oldroyd  17
01-14-2005 09:29 AM ET (US)
When I say < to < , I mean & l t ; to < .

:-)

Bill
Bill Oldroyd  16
01-14-2005 09:28 AM ET (US)
I am using findtext to extract data from one XML instance to another. findtext converts < to < etc., which is most useful when you want text, but it means I have to convert the < back again to < .

Is there any way of avoiding this?

Sorry if this is a simple question. I find ElementTree very easy to use.

Bill
bill.oldroyd@bl.uk

[I am using ElementTree to help create a gateway to convert between NLM Entrez web service for PubMed and a standard HTTP search protocol SRU - if anyone is interested.]
Tim Cradic  15
01-14-2005 09:07 AM ET (US)
I am trying to use your Console module to set up a non-blocking keypress poll routine. I am running a background function and want to break out if a key is pressed. Do you have any examples that show how to use the Console for this purpose?

Thank you.
Fredrik Lundh  14
01-13-2005 09:44 AM ET (US)
it's a compatibility glitch: cElementTree uses ET 1.3 names, and doesn't provide an ET 1.2-compatible alias for the parser class. so to use the XMLTreeBuilder directly, you have to access it as XMLParser.

I'll fix this in the next cET release.
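
Until then, one possible workaround is to look the class up under either name. This is only a sketch: it uses the standard library's xml.etree.ElementTree as a stand-in for cElementTree, and parser_class is a name made up here:

```python
import xml.etree.ElementTree as ET

# Prefer the old 1.2 name if the module provides it, else use the 1.3 name.
parser_class = getattr(ET, "XMLTreeBuilder", ET.XMLParser)

parser = parser_class()
parser.feed("<root><item/></root>")
root = parser.close()  # close() returns the root element
```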
Richard Sharp  13
01-13-2005 09:30 AM ET (US)
Am I missing something here? I tried to slot in cElementTree where I previously imported ElementTree. I then tried to use XMLTreeBuilder, which is obviously not in the C module but is in the Python module. Is there a way I have overlooked for getting round this?
Ming  12
12-19-2004 10:40 PM ET (US)
Edited by author 12-22-2004 09:37 PM
Hi,

Could someone try to feed in a Google search results page into TidyHTMLTreeBuilder and see if it crashes? Mine crashes every time with a form inside a table. I wonder if it's just me here. Please confirm.

Edit:
** This is a page you can try: (http://www.google.com.au/search?hl=en&q=effbot&meta=) Please help. This is urgent.

Thank you very much in advance
Stewart Midwinter  11
12-19-2004 08:35 PM ET (US)
Edited by author 12-19-2004 08:37 PM
hi, the alternative method for dealing with too-long strings in the Validated Entry widget on this page: http://effbot.org/zone/tkinter-entry-validate.htm
and described as follows:
'Note that if the user pastes a long string into the entry box, it will be rejected by this implementation. A better solution might be to change the validate method to:
    def validate(self, value):
        if self.maxlength:
            value = value[:self.maxlength]
        return value

appears to throw an exception if you type more than maxlength characters. My thought is that the validation method cannot deal with the situation where you want to keep typing past maxlength characters. Instead, you need to define a getresults method that will return all of the validated string except in the case of a ChopLength class; in that case it will return only maxlength characters.

Also, if you attempt to set an initial value for the entry fields, it will only be accepted for the integer or float entry fields. The init method for the MaxLength subclass has to be modified to pass the value argument through.

Lastly, if you set an initial text that is longer than the MaxLength class will allow, you will be unable to edit it in any way. As described below, I've added code to deal with this.

I've described a solution on the Tkinter wiki at:
http://tkinter.unpy.net/wiki/ValidateEntry
Fredrik Lundh  10
12-19-2004 04:26 PM ET (US)
The ElementTree class is not really designed for round-tripping of human-authored documents; the parsers are only concerned about the infoset, and the tree writer will happily use its own way to encode things, completely ignoring whether something was originally a character reference or an entity or a CDATA section, etc.

You can add comments to trees, though, so it should be possible to tweak one of the parsers so it preserves comments. I'll see if I can dig up an example...
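
For readers on a current Python, a minimal sketch of one way to do this. It assumes Python 3.8 or later, where the standard library's TreeBuilder accepts insert_comments=True; it is not the elementtree package discussed in this thread, and the sample document is made up:

```python
import io
import xml.etree.ElementTree as ET

source = "<root><!-- keep me --><item>text</item></root>"

# A tree builder that keeps comments as nodes instead of dropping them.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
tree = ET.parse(io.StringIO(source), parser=parser)

# Serialize again; the comment inside the root element survives.
output = ET.tostring(tree.getroot(), encoding="unicode")
```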
Alan  9
12-18-2004 04:09 PM ET (US)
Preserving comments in ElementTree

Is there a way to preserve comments using ElementTree?

I thought it might have been something I was doing, but even the simplest possible round trip:

ElementTree.ElementTree(file='file-with-comments.xml').write('file2.xml')

... loses comments that were in the original file. Which is catastrophic (ok, means going and getting the backup copy) if you are actually writing back to the same file and had temporarily (you thought) commented something out.
Ming  8
12-16-2004 01:08 AM ET (US)
Hi,

Does TidyHTML ignore some tags e.g. <caption>? When I write the tree back out, the <caption> tags disappear. Please help.

Cheers,
Michael
Fredrik Lundh  7
12-15-2004 02:25 AM ET (US)
in the body case, the "hello" text ends up in the 'text' attribute of the body node (in HTML/CSS terminology, this is known as an "anonymous block").

in the other cases, the trailing text is stored in the 'tail' attribute of the preceding element; see:

http://effbot.org/zone/element-infoset.htm#mixed-content
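
A short illustration of the two attributes, assuming the standard library's xml.etree.ElementTree and a made-up fragment with a <b> child:

```python
import xml.etree.ElementTree as ET

body = ET.fromstring("<body>hello<b>foo</b> bar</body>")

# Text before the first child belongs to the parent's 'text' attribute.
leading = body.text

# Text after a child's end tag belongs to that child's 'tail' attribute.
child = body[0]
inner = child.text
trailing = child.tail
```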
Ming  6
12-14-2004 11:54 PM ET (US)
Edited by author 12-14-2004 11:54 PM
Hi all,

With <html><body>hello</body></html> how is "hello" stored in the element tree? Which node is it under? Similarly, with:
foo blah bar, how is bar stored? Which node is it in?

Cheers,
Ming
Ming  5
12-14-2004 11:33 PM ET (US)
Deleted by author 12-14-2004 11:50 PM
Michael  4
12-13-2004 11:00 PM ET (US)
Hi,

I'm using TidyHTMLTreeBuilder, along with ElementTree, and they are excellent! However, I'm experiencing some problems when throwing CNN.com's website into it. When I do that, I get the following error:

"..in parse
    tree = TidyHTMLTreeBuilder.parse(source)
  File "C:\Python23\Lib\site-packages\elementtidy\TidyHTMLTreeBuilder.py", line 89, in parse
    return ElementTree.parse(source, TreeBuilder())
  File "C:\Python23\lib\site-packages\elementtree\ElementTree.py", line 865, in parse
    tree = ElementTree()
  File "C:\Python23\lib\site-packages\elementtree\ElementTree.py", line 590, in parse
    parser.feed(data)
  File "C:\Python23\Lib\site-packages\elementtidy\TidyHTMLTreeBuilder.py", line 75, in close
    return ElementTree.XML(stdout)
  File "C:\Python23\lib\site-packages\elementtree\ElementTree.py", line 879, in XML
    parser.feed(text)
  File "C:\Python23\lib\site-packages\elementtree\ElementTree.py", line 1169, in close
    def close(self):
ExpatError: no element found: line 1, column 0
>>> "

It seems to me that the program crashes when there's a form around all rows, i.e. <table><form><tr><td>test</td></tr></form></table>

Could someone else try it on their system?

Cheers,
Michael
MatsunoTokuhiro  3
12-13-2004 09:28 AM ET (US)
I wrote a Google Suggest library, named PyGoogleSuggest.
http://tokuhirom.dnsalias.org/~tokuhirom/w...yGoogleSuggest_2den
Fredrik Lundh  2
12-12-2004 05:33 PM ET (US)
Google moves in mysterious ways...
Michael Hudson  1
12-09-2004 11:22 AM ET (US)
I like the way your google ads are now all about Hamlet.