Let me caveat this rant with the fact that I've only been playing with
matplotlib for approximately a week.
All the demos made matplotlib (a python module) look like a great tool that I
should want to use to graph things, then I started trying to actually write
code and it all went downhill.
Almost all of the functions operate on some mystical global scope, meaning they
are by design not threadsafe. Probably not a big deal, I guess, but it
certainly feels like an alien world especially given all the object oriented
code in use in python.
If this culture shock wasn't bad enough, it went ahead and decided to use
inches and ratios as the standard units of measure. You make a figure of a set
width and height (in inches) and you can put stuff in that figure given ratio
offsets. An offset of '.5' would put your left-bound in the middle. Weird and
unexpected. Perhaps not bad. Still, I'm used to pixels, not inches.
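For anyone else who thinks in pixels: the pixel size of a figure is just its size in inches multiplied by its dots-per-inch setting. A rough sketch of that arithmetic, assuming a dpi of 100 (the helper names `figure_pixels` and `figure_inches` are made up for illustration; they are not matplotlib functions):

```python
def figure_pixels(width_in, height_in, dpi=100):
    """Return the (width, height) in pixels of a figure sized in inches."""
    return (int(round(width_in * dpi)), int(round(height_in * dpi)))


def figure_inches(width_px, height_px, dpi=100):
    """Return the (width, height) in inches to request for a pixel size."""
    return (width_px / float(dpi), height_px / float(dpi))


if __name__ == "__main__":
    print(figure_pixels(8, 6))       # an 8x6 inch figure at the assumed dpi
    print(figure_inches(640, 480))   # inches to ask for to get 640x480 pixels
```

The same idea applies to the ratio offsets: an offset of .5 on a figure that is 800 pixels wide lands at pixel 400.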
Some of the arguments are just looney:
fig.add_subplot(111)
The docs say: "subplot(211) # 2 rows, 1 column, first (upper) plot".
Base10 flag system? What. the. F. I'm at a loss as to why this was ever a good
idea. Let's make it hard to add plots? Looks like you can use 'subplot(rows,
cols, plotnum)' which is the sensible solution, but all the demos use the
integer syntax, and it makes me sad.
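To make the integer shorthand less mysterious, here is a small sketch of how a three-digit code maps onto the (rows, cols, plotnum) form. The function `decode_subplot_code` is my own illustration of the digit-splitting, not matplotlib's code:

```python
def decode_subplot_code(code):
    """Split a three-digit subplot code such as 211 into (rows, cols, plotnum)."""
    if not 111 <= code <= 999:
        raise ValueError("expected a three-digit integer, got %r" % (code,))
    rows, remainder = divmod(code, 100)     # hundreds digit
    cols, plotnum = divmod(remainder, 10)   # tens digit, ones digit
    if cols == 0 or plotnum == 0 or plotnum > rows * cols:
        raise ValueError("invalid subplot code %r" % (code,))
    return rows, cols, plotnum


if __name__ == "__main__":
    print(decode_subplot_code(211))
    print(decode_subplot_code(111))
```

One consequence of packing everything into single digits is that the shorthand cannot describe a grid with more than 9 rows or columns, which is another reason to prefer the three-argument form.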
You can't easily put the legend outside the graph.
Setting the default font size means you have to set at least 6 things. Make
sure you note the excessive use of different tokens for the same freaking
setting: labelsize, titlesize, size, and fontsize.
rc("axes", labelsize=10, titlesize=10)
rc("xtick", labelsize=10)
rc("ytick", labelsize=10)
rc("font", size=10)
rc("legend", fontsize=10)
I have code that looks like this:
fig = figure()
p = subplot(111)
line = p.plot_date(dates, values)
line[0].set_label("foo")
legend()
fig.savefig('foo.png', format='png')
Notice my entertaining leaps between OOP and WTF. Other cute nuances are that
the docs/examples are littered with:
ax = subplot(111)
You might think that the name 'ax' means 'axis' and that subplot returns an
axis. No. You might ask python with type() and it would say "<type
'instance'>". Helpful. If you just print ax you'll see it is
matplotlib.axes.Subplot. I'm trying hard to not get hung up on semantics, but
'axis' to me is very different from a plot. Plot seems like a visual
representation, and an axis is a single dimension of a graph (aka a plot).
After several days of playing with this tool, I am frustrated and disheartened.
It has such powerful features like tick rules: You can trivially specify "Put
one major tick every 3rd week". However, the api is half OO half
globally-scoped-procedural. Maybe this is my fault. The docs constantly mix
'matplotlib' and 'pylab' methods. Perhaps you can use just the matplotlib
functions by themselves and you don't need pylab? Pylab, by the way, is what
provides these awkward global functions and in theory only exists as a pure
wrapper on top of matplotlib.
Comments: 3 (view comments)
Tags: rants, matplotlib
Permalink: /geekery/matplotlib-induces-self-loathing
posted at: 05:39
So a short while ago I published the tabsearch firefox extension. I thought to myself, "Why not put it up on addons.mozilla.org?"
To publish, you need to submit it to the addons review system. Submitting it puts it in the "sandbox". To leave the sandbox and go public it must be nominated. To pass nomination it must meet a large set of criteria, all of which make some amount of sense with respect to quality assurance, etc.
I've submitted it 3 times. Every time it's been denied for different reasons. The first time was half reasonable, because one of the reasons was "Remove those debugging statements". Other reasons have been:
- "Document your preferences"
- tabsearch doesn't have any options, preferences, or tweakables
- "Your extension must have atleast one review from one of your users"
- Do I have a QA team who can review this for me? I thought the reason I
was publishing it on mozilla addons was to get users. Seems like an awkward
bootstrapping problem I'm not going to bother solving.
- "Make the key binding configurable"
- That's what keyconfig is for :(
While I entirely agree that quality assurance through a review process is a
great and useful idea, I think the firefox addons policies and reviewership
group have taken it a bit far. There are only so many revisions I'm willing to
do for the sake of publishing somewhere else. So, until I can find more time to
throw at getting published at mozilla addons, you can expect to only find
tabsearch here.
Benjamin Franklin wrote a blurb about perfection, '"Yes," said the man, "but I
think I like a speckled axe best."'. Most of the time, perfection isn't worth
the effort when something is already good enough.
I don't mean to discourage people from submitting to mozilla addons, but after
3 attempts it's really not worth it. Basically, the fine, nearly-unwritten
print in the policy is that you need real people to have submitted very
detailed reviews of your extension before it'll be approved.
Comments: 0 (view comments)
Tags: rants, beaurocracy
Permalink: /rants/submitting-to-firefox-addons
posted at: 02:19
-bash-3.1# yum install django
No Match for argument: django
Nothing to do
-bash-3.1# yum install Django
Downloading Packages:
(1/1): Django-0.95.1-1.fc 100% |=========================| 1.5 MB 00:02
Ahh. Clearly.
Comments: 0 (view comments)
Tags: rants, fedora, linux
Permalink: /rants/fedora-yum
posted at: 01:48
There once was a database named MySQL.
It had a query cache, because caching helps performance.
It also had queries you could "prepare" on the server-side, with the hope that
your database server can make some smart decisions what to do with a query
you're going to execute N times during this session.
I told mysql to enable its caching and use a magic value of 1 GB for memory storage. Much to my surprise, I see the following statistic after testing an application:
mysql> show status like 'Qcache_%';
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| Qcache_free_blocks      | 1          |
| Qcache_free_memory      | 1073732648 |
| Qcache_hits             | 0          |
| Qcache_inserts          | 0          |
| Qcache_lowmem_prunes    | 0          |
| Qcache_not_cached       | 814702     |
| Qcache_queries_in_cache | 0          |
| Qcache_total_blocks     | 1          |
+-------------------------+------------+
8 rows in set (0.00 sec)
Why are so many (all!?) of the queries not cached? Surely I must be doing
something wrong. Reading the doc on caching explained what I can only
understand as a complete lapse of judgement on the part of MySQL developers:
from http://dev.mysql.com/doc/refman/5.0/en/query-cache.html
Note: The query cache is not used for server-side prepared statements. If you're using server-side prepared statements consider that these statement won't be satisfied by the query cache. See Section 22.2.4, C API Prepared Statements.
Any database performance guide anywhere will tell you to use prepared
statements. They're useful from both a security and performance perspective.
Security, because you feed the prepared query data and it knows what data types
to expect, erroring when you pass something invalid. It also will handle
strings properly, so you worry less about sql injection. You also get
convenience, in that you don't have to escape your data.
Performance, because telling the database what you are about to do lets it
optimize the query.
This performance is defeated, however, if you want to use caching. So, I've got
a dilemma! There are two mutually-exclusive (because MySQL sucks) performance-enhancing options available to me: using prepared statements or using caching.
Prepared statements give you two performance benefits (maybe more?). The first
is that the server will parse the query string when you prepare it, and execute
the "parsed" version whenever you invoke it. This saves parsing time; parsing
text is expensive. The second is that if your database is nice, it will try to
optimize your queries before execution. Using prepared statements will permit
the server to optimize query execution once, and then remember it. Good, right?
Prepared statements improve CPU utilization, in that the cpu can work less
because you're teaching the database about what's coming next. Cached query
responses improve disk utilization, and depending on implementation should
vastly outperform most (all?) of the gains from prepared statements. I'm
assuming here that disk is slow and cpu is fast.
Cached queries will (should?) cache results of complex queries. This means that
a select query with multiple, complex joins should be cached mapping the query
string to the result. No amount of statement preparation will improve complex
queries because they still have to hit disk. Large joins require lots of disk
access, and therefore are slow. Remembering "This complex query" returned "this
happy result" is fast regardless of whether or not it's stored on disk or in
memory. Caching also saves cpu utilization.
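To be concrete about what I mean by "mapping the query string to the result", here is a toy sketch. It is emphatically not how MySQL implements its query cache; the class `ToyQueryCache`, its counters, and the fake backend are all invented for illustration, and a real cache would also have to invalidate entries when the underlying tables change:

```python
class ToyQueryCache(object):
    """Toy cache mapping exact query text to a previously computed result."""

    def __init__(self, run_query):
        self._run_query = run_query   # callable that does the expensive work
        self._results = {}            # query text -> cached result
        self.hits = 0
        self.inserts = 0

    def execute(self, sql):
        if sql in self._results:
            self.hits += 1            # answered without touching the backend
            return self._results[sql]
        result = self._run_query(sql)
        self._results[sql] = result
        self.inserts += 1
        return result

    def invalidate(self):
        """Drop everything, e.g. after a write to an underlying table."""
        self._results.clear()


if __name__ == "__main__":
    calls = []

    def slow_backend(sql):
        calls.append(sql)             # stands in for parsing, planning, disk reads
        return [("row for", sql)]

    cache = ToyQueryCache(slow_backend)
    cache.execute("SELECT * FROM a JOIN b USING (id)")
    cache.execute("SELECT * FROM a JOIN b USING (id)")
    print(cache.hits, cache.inserts, len(calls))
```

The second identical query never reaches the backend at all, which is the whole appeal: no parsing, no planning, no disk.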
I can't believe preparing a query will prevent it from being pulled from the
query cache, but this is clearly the case. Thanks, MySQL, for making a stupid
design decision.
Maybe there's some useful JDBC (oh yeah, the app I'm testing is written in
Java) function that'll give you all the convenience/security benefits of
prepare, but without the server-side bits, and thus let you use the query
cache.
Comments: 2 (view comments)
Tags: mysql, rants, performance
Permalink: /geekery/mysql-prepare-queries-not-cached
posted at: 21:26
I often see people write their email addresses as "foo at bar dot
org" in a hopeful effort to keep spammers from scraping them.
Heck, mail archive systems often have (and are deployed with) options to
obfuscate email addresses systematically, using the same pattern: foo at bar dot com.
All it does is hurt usability.
Googling for "* at * dot *" clearly shows lots of matches. It also matches all of the following variants, due to google searches ignoring brackets and such in words:
- foo at bar dot com
- foo [at] bar [dot] com
- foo (at) bar (dot) com
- ... etc ...
Query, scrape, replace 'at' and 'dot' as desired. I now have 54 million email addresses. What now?
Seems like this effort only serves to have people fool themselves as well as to
impede usability. It certainly won't protect you from spam. Why is this method used?
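The "replace 'at' and 'dot' as desired" step really is this small. A minimal sketch, assuming the scraped text is already in a string; the regular expressions and the function name `deobfuscate` are my own, and they only cover the variants listed above:

```python
import re

# "at" / "dot", optionally wrapped in (), [] or {}, with whitespace on both sides.
_AT = re.compile(r"\s+[\[({]?\s*at\s*[\])}]?\s+", re.IGNORECASE)
_DOT = re.compile(r"\s+[\[({]?\s*dot\s*[\])}]?\s+", re.IGNORECASE)

# local-part, an "at", then a domain made of two or more "dot"-separated labels.
_OBFUSCATED = re.compile(
    r"[\w.+-]+\s+[\[({]?\s*at\s*[\])}]?\s+"
    r"[\w-]+(?:\s+[\[({]?\s*dot\s*[\])}]?\s+[\w-]+)+",
    re.IGNORECASE,
)


def deobfuscate(text):
    """Return every 'foo at bar dot com' style address found in text, repaired."""
    found = []
    for match in _OBFUSCATED.finditer(text):
        address = _AT.sub("@", match.group(0), count=1)
        address = _DOT.sub(".", address)
        found.append(address)
    return found


if __name__ == "__main__":
    sample = "Mail: foo at bar dot com, or try baz [at] qux (dot) example {dot} org"
    print(deobfuscate(sample))
```

It will happily produce false positives on ordinary sentences that contain "at" and "dot", but a spammer does not care about a few bad addresses in a list of millions.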
Comments: 2 (view comments)
Tags: spam, mail, google, rants
Permalink: /geekery/anti-spam-obfuscation-easily-defeated
posted at: 22:55
So, I've been reading docs on python's xml stuff, hoping there's something
simple or comes-default-with-python that'll let me do xpath. Everyone
overcomplicates xml processing. I have no idea why. Python seems to have enough
alternatives to make dealing with xml less painful.
Standard python docs will lead you astray:
kenya(...ojects/pimp/pimp/controllers) % pydoc xml.dom | wc -l
643
Clearly, the pydoc for "xml.dom" has some nice things, right? I mean, documentation is clearly an indication that THE THING THAT IS DOCUMENTED IS AVAILABLE. Right?
Sounds great. Let's try to use this 'xml.dom' module!
kenya(...ojects/pimp/pimp/controllers) % python -c 'import xml; xml.dom'
Traceback (most recent call last):
File "<string>", line 1, in ?
AttributeError: 'module' object has no attribute 'dom'
WHAT. THE. HELL.
Googling around, it turns out that 'xml' is a fake module that only actually works if you have the 4Suite modules installed? Maybe?
Why include fake modules that provide complete documentation to modules that do not exist in the standard distribution?
Who's running this ship? I want off. I'll swim if necessary.
As it turns out, I made too strong an assumption about python's affinity
towards java-isms. I roughly equated 'import foo' in python as 'import foo.*'
in java. That was incorrect. Importing foo doesn't get you access to things in
its directory; they have to be imported explicitly.
In summary, 'import xml' gets you none of its submodules. 'import xml.dom'
doesn't get you minidom either.
If you really want minidom's parser, you'll need 'import xml.dom.minidom' or a
'from import' variant.
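For the record, here is a minimal sketch of the explicit import that does work, using only the standard library. The XML snippet and the variable names are made up for the example:

```python
import xml.dom.minidom   # naming the submodule explicitly is what loads it

document = xml.dom.minidom.parseString(
    "<playlist><song>one</song><song>two</song></playlist>"
)

# Collect the text inside each <song> element.
titles = [node.firstChild.data for node in document.getElementsByTagName("song")]
print(titles)

# The 'from ... import' variant binds just the name you ask for.
from xml.dom.minidom import parseString

print(parseString("<a/>").documentElement.tagName)
```

Once 'xml.dom.minidom' has been imported anywhere in the program, 'xml.dom' and 'xml.dom.minidom' are both reachable as attributes of 'xml', which is why the one-liner with a bare 'import xml' fails but larger programs sometimes appear to get away with it.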
On another note, the following surprised me. I had a module, foo/bar.py. I
figured 'from foo import *' would grab it. It doesn't, which means 'from xml.dom
import *' doesn't get you minidom and friends.
Perhaps I was hoping for too much, but maybe it's better to import explicitly.
If that's the case, then why have rules that let '*' import everything from a
module, but not from a package?
Comments: 2 (view comments)
Tags: rants, python, xml
Permalink: /geekery/python-and-xml
posted at: 21:23
Happy Halloween, folks. It's been 20 days since my last post. I've been
incredibly busy with work and haven't had a chance to write. As a gift, I give
you a rant.
I've been through no less than 3 DNS service providers in the past week, and
all of them suck. They suck hard.
The first one I looked at was no-ip. No-IP claims they support 'dynamic dns' -
they don't. The first thing you must realize about almost all dns providers is
that while they claim they support "dynamic dns" and/or "round robin," what
they really mean is their support of 'dynamic dns' is based solely around one
single use case. One.
What is that use case? The following picture comes from dynu.com:
What is this? This is the use case of one computer updating its own hostname with
whatever IP it happens to have at that moment. Businesses can't possibly find
this useful. It doesn't scale. If you have more than one server you want to put
on a single hostname, this use case fails you miserably.
I've looked at no-ip, dyndns, dnspark, and several others. Trash.
Keep in mind, this rant is because both free AND pay-for dns providers suck.
Both kinds. Free services actually have an excuse - you get what you pay for.
As a precursor, let me explain what I need from a dns provider:
- The ability to add and remove dns entries of any record type, at any time.
- The ability to add multiple entries for the same record
Many claim these features. Those I tried fail miserably.
If you are in the market for a real dns provider, as I am, you'll find many dns
providers claiming what I listed above. "Sure! We support round robin!" they
advertise, "We support dynamic dns!"
What they don't tell you in the same paragraph is that you have to use their
own HTTP-based means of pushing dns changes. They absolutely don't tell you
that their pathetic attempt at providing this "dynamic" service via a cgi-like
interface is absolutely crippled.
Several providers allowed you to mutate records dynamically. However, none of
them I tried let me add multiple entries for a single record using the dynamic
interface.
An important realization is that my definition of dynamic is not the
same as these dns providers' notion of dynamic. This so-called dynamic dns
ability hinges on customers who want to be able to host crap out of their
dynamic-ip-giving ISP. As such, most of the interface is just "Hey DNS
provider! Please update www.foo.com with whatever IP this packet is coming
from! Thanks!" This is intolerable!
What is my definition of "dynamic dns," exactly? Let's call it RFC 2136. Heck, I don't care if it's not RFC 2136, just that I'm able to do most things that update specification provides.
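To show how little I'm asking for, here is a sketch that builds the input you would feed to BIND's nsupdate tool to put several addresses on one name. The function name, server, zone, and addresses are all hypothetical placeholders; only the nsupdate commands themselves (server, zone, update delete, update add, send) are real:

```python
def build_nsupdate_script(server, zone, name, addresses, ttl=300):
    """Return nsupdate input that replaces all A records for `name`."""
    lines = [
        "server %s" % server,
        "zone %s" % zone,
        "update delete %s A" % name,   # drop whatever A records exist now
    ]
    for address in addresses:
        # One 'update add' per address puts several entries on a single name.
        lines.append("update add %s %d A %s" % (name, ttl, address))
    lines.append("send")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_nsupdate_script(
        "ns1.example.com",                 # hypothetical primary server
        "example.com",
        "www.example.com.",
        ["192.0.2.10", "192.0.2.11"],      # documentation-range addresses
    ))
```

The output would typically be piped into something like 'nsupdate -k keyfile', with the key being whatever the provider handed you. That is the entire client side of the "dynamic" interface I want.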
To quote ZoneEdit customer support regarding my issues with their service and
in particular how to properly use their crippled dynamic update interface:
"You can atleast update hourly .
Updating too often with the same IP address gets your account locked up."
WHAT?! Once hourly? Shit. DNS is hard. Let's go shopping instead.
Doing this right is not hard. For example, I recently posted an article on how
to set up dynamic dns and make your dhcp server talk sweetly to dns. I
use this same configuration in my apartment. MY APARTMENT. My apartment is
considerably smaller than, say, a multidatacenter dns provider. Why doesn't
anyone at any of these dns providers have a freaking clue about running a dns
server? Let me put it plainly:
I will give you money and you will give me a dnssec key and a server on which
to use it. That shall be the extent of our relationship.
That's all I want. The worst part is that it doesn't matter who you go with.
There are plenty of free dns providers who provide you the same crappy service
as give-us-your-money providers.
Really. Come on kids.
Look at it this way - To enable dynamic dns updates, you don't need to write
any code. A few tiny named.conf changes. To provide a pathetic http interface
you label as "dynamic dns" requires lots of lines of code, lots of testing, and
$$$ invested in this kind of product.
To further show how stupid this is: Microsoft supports this properly.
Microsoft. You know, that company everyone hates-on for proprietary protocols
and ignorance of standards? Microsoft DNS will send updates using BIND's update
protocol. How do I know this? I've had a primary dns server running BIND and
Microsoft DNS running as a secondary. I told Active Directory that its primary
dns was the BIND server. Guess what happened? Active Directory happily
submitted updates to my BIND server. Correctly.
You might be thinking to yourself, "Why don't you just host dns yourself?"
Because I don't have any servers on a static IP address. And no, this isn't
running out of my apartment.
Am I the only one who can't find a dns provider that doesn't suck?
Comments: 2 (view comments)
Tags: dns, rants, service providers
Permalink: /rants/dns-providers
posted at: 23:15
I followed a webclip link out of gmail today and it dropped me off at a news
story on Forbes.com. I wanted to read this story. However, I was presented with
something horrific. I was presented with the results of a tragic effort that I
can only presume is a scheme to show as many "punch the monkey" advertisements
as possible.
What is this scheme? Well. I landed on the page. This page had two
average-length paragraphs. No sooner had I finished reading the first paragraph
than the page reloaded and showed me another, new piece of text.
Six seconds later. A new page.
Repeat.
Turns out Forbes.com has some sort of slideshow they try to use to display
stories. To make matters worse, there are advertisements everywhere. By the time I
figured out what part of the page I was supposed to be looking at, it went to
the next page. Sure, you can stop the slideshow, but I only found that out afterwards.
Thanks Forbes. I almost read one of your stories.
Clicky for an example article
Thumbnail screenshot of the page follows. Enjoy the massive amount of whitespace and adspace.
Comments: 3 (view comments)
Tags: wtf, user experience, rants
Permalink: /rants/forbes-dot-com-sucks
posted at: 18:47
Add lacking dynamic assignment ability to my "I wish Python had Foo" list.
Python does not appear to have dynamically assignable arrays. Where are we, C? Assembly? When I assign past the end of the array, I mean resize the god damned array. Thanks.
nightfall(~) % python -c "foo = []; foo[3] = 234"
Traceback (most recent call last):
File "<string>", line 1, in ?
IndexError: list assignment index out of range
This is completely unacceptable.
Sure, I can use list comprehensions to make an N element array that's empty:
foo = [None for x in range(100)]
foo[44] = "Hi"
That only gets me an array with 100 empty elements. Uh.. Not what I want. If I
did this on an array with data in it I didn't want to lose, I'd lose all the
data.
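The workaround that keeps existing data is to pad the list only as far as needed before assigning. A minimal sketch; the helper name `assign_at` is made up, and negative indexes are deliberately rejected to keep it simple:

```python
def assign_at(items, index, value, fill=None):
    """Set items[index] = value, padding the list with `fill` if it is too short."""
    if index < 0:
        raise IndexError("negative indexes are not supported here")
    shortfall = index + 1 - len(items)
    if shortfall > 0:
        items.extend([fill] * shortfall)   # grows in place; existing data is kept
    items[index] = value
    return items


if __name__ == "__main__":
    foo = ["a", "b"]
    assign_at(foo, 3, 234)
    print(foo)
```

If the indexes are sparse, a dict keyed by integer (foo = {}; foo[3] = 234) does the same job without any padding at all.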
Sigh...
Comments: 2 (view comments)
Tags: rants, python
Permalink: /rants/python-problems
posted at: 00:15