| Who | When |
Messages | |
|
|
|
jleader
|
60
|
 |
|
02-25-2003 01:11 PM ET (US)
|
|
Someday, Unicode will solve all our problems. Then, we'll have to invent new problems.
|
Chris Smith
|
59
|
 |
|
02-25-2003 11:32 AM ET (US)
|
|
I had a long thing written here about why curly quotes lead to broken feeds. Then I figured out what ACTUALLY happened - it's almost Cory's fault (grin). Here goes...
This is Blogger Pro's xml declaration (sans angle brackets):
?xml version="1.0"?
This is bb's headers from http://boingboing.net/rss.xml:
HTTP/1.1 200 OK
Date: Tue, 25 Feb 2003 16:46:42 GMT
Server: Apache
Last-Modified: Tue, 25 Feb 2003 16:22:28 GMT
ETag: "7d57-49f5-3e5b9844"
Accept-Ranges: bytes
Content-Length: 18933
Connection: close
Content-Type: text/xml
Notice that neither of these specifies a text encoding. Now for the most critical bit from 4.3.3 in the XML spec:
It is also a fatal error if an XML entity contains no encoding declaration and its content is not legal UTF-8 or UTF-16.
...and that's where it was going sour. Curly quotes (and foreign currencies and simple fractions) would cause invalid utf-8 sequences to appear, breaking any utf-8 parsers.
Given the combination of RSS and headers, it was incumbent on Cory et al to ensure that all postings were in utf-8. This would be difficult though, given that many sites (including Blogger) include no HTML charset declaration. In such cases, the usual fallback is to assume ISO-8859-1. This would *almost* be ok, except that there are no curly quotes in 8859-1, so many systems cheat, and just use the windows-1252 codes.
The resulting conflict of assumptions meant that postings to the site went in in ISO-8859-1, but came out in the RSS feed in utf-8. Unfortunately, no actual conversion of data took place at the server to deal with this change.
Aaron's instructions have replaced assumptions with solid references. bb now comes out in utf-8 (because of the meta tag in the top), and Mozilla can be told to override the submission in 8859-1 and use utf-8. Mozilla appears to have the smarts to coerce many foreign types into utf-8 correctly, so that cutting and pasting invokes a conversion to the appropriate encodings.
Cutting and pasting (particularly the pasting) appears to be a magical process - it changes depending on the various settings of all the various components. This is the clue to the occasional nature of the problem - there are just too many hidden settings to keep track of. The one thing that would make this job easier would be having the server side automatically declare utf-8 for both the bb pages and for the posting pages. Five years ago it would have broken too many browsers. Maybe the time for that change has finally come.
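The fatal error quoted from section 4.3.3 can be reproduced in a few lines. This is only a sketch: it assumes Python's standard xml.etree.ElementTree parser as a stand-in for a strict feed reader, and the function name parses_cleanly and the one-line sample feed are made up for illustration. The XML has no encoding declaration, as in the Blogger Pro declaration above.

```python
import xml.etree.ElementTree as ET


def parses_cleanly(feed_bytes: bytes) -> bool:
    """Return True if a strict XML parser accepts the raw bytes."""
    try:
        ET.fromstring(feed_bytes)
        return True
    except ET.ParseError:
        return False


# Hypothetical feed with no encoding declaration (illustration only).
TEMPLATE = '<?xml version="1.0"?><rss><title>{}</title></rss>'
text = TEMPLATE.format("\u201cHello\u201d")  # curly double quotes

as_cp1252 = text.encode("cp1252")  # bytes 0x93 and 0x94 for the quotes
as_utf8 = text.encode("utf-8")     # three bytes per quote

print("windows-1252 bytes accepted:", parses_cleanly(as_cp1252))
print("utf-8 bytes accepted:", parses_cleanly(as_utf8))
```

With no declaration the parser has to assume UTF-8, so the single-byte windows-1252 quotes are rejected while the same text encoded as UTF-8 goes through.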
|
Eli the Bearded
|
58
|
 |
|
02-24-2003 06:58 PM ET (US)
|
|
Cory, about /m54, quite possibly the problem is that the clipboard does not contain information about the charset, or that when pasted that information is not checked. It would match, e.g., the epoch problem when cutting and pasting dates between Mac- and Windows-originated Excel documents. So much is affected by something as seemingly simple as character sets.
|
Cory Doctorow
|
57
|
 |
|
02-24-2003 05:50 PM ET (US)
|
|
Well, hush my mouth! It *does* work -- yer a genius, Aaron.
|
Aaron Swartz
|
56
|
 |
|
02-24-2003 05:43 PM ET (US)
|
|
Cory, as I said, you need to put <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> in the <head> of the HTML template
|
Cory Doctorow
|
55
|
 |
|
02-24-2003 05:37 PM ET (US)
|
|
Aaron, that doesn't work either. The curly-quotes show up in the RSS, but when published to the blog, they show up as diphthongs.
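One plausible way to get that symptom is UTF-8 bytes being read back as windows-1252. The thread does not say exactly which characters appeared, so the following is a sketch under that assumption; the function name is hypothetical.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as windows-1252."""
    return text.encode("utf-8").decode("cp1252")


left_quote = "\u201c"  # LEFT DOUBLE QUOTATION MARK
garbled = misread_utf8_as_cp1252(left_quote)

# Show the code points that one curly quote turns into.
print([hex(ord(ch)) for ch in garbled])
```

Under this assumption a single curly quote becomes three characters, the last of which is the oe ligature (U+0153), which would fit the "diphthongs" description.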
|
Cory Doctorow
|
54
|
 |
|
02-24-2003 05:36 PM ET (US)
|
|
Chris, look at the original post: the tools aren't there yet. Every tool I've used can and does break in some instances of curly-quotes -- try pasting a word doc with curly-quotes into mutt and mailing it to someone who reads it with Mailsmith -- at least one of those apps is going to render incorrectly, reducing the readability of your system.
The fact is that despite "standardization," curly-quotes are still mostly a proprietary affair.
|
Chris Smith
|
53
|
 |
|
02-24-2003 05:15 PM ET (US)
|
|
I'd love to see how any system thinks it can send or receive paired quotes in ISO-8859-1, since that character set doesn't include the relevant characters.
It's lying to you - the problem is knowing exactly HOW it's lying. That was always the problem I ran into when I hit character problems - there was never any clear documentation about which system bits did which conversion bits, and what spec they used to do it.
|
Chris Smith
|
52
|
 |
|
02-24-2003 04:23 PM ET (US)
|
|
Cory? Hang in there....
1) This stuff works ok, if not perfectly, when I try it. In fact, some of the not perfect is local - things like my browser which lies to me.
2) There are even tools out there to automatically convert straight quotes (the feet-inches type) to paired quotes ... including some common ones such as MS Word.
3) These are legal constructs in XML and thus in XHTML and RDF.
Given that it is both legal and working, how do you explain why people should stop? Fair enough, your tools break. But if they are breaking on what is defined as legitimate content, then is this the originators' fault?
Further - a validation run on bb's RSS suggests that the ONLY out-of-valid item is a single date field. Simply adding paired quotes to an otherwise valid RSS feed will not make it invalid. Even the windows-1252 characters are legit XML, they just shouldn't be displayed as quotes.
The one-sender / multi-recipient model DEPENDS on standards. You can't be driven by every tool author out there who complains that your feed breaks his reader if your feed is valid. It is NOT 'hiding behind the spec' to point them to a validation check, and tell them that they have to accept valid feeds. This is why validators sometimes show you how to add a 'validate my feed' link to your work. If the first place someone can go is a validator that shows your feed is fine, then that is less likely to be an email to you. Maybe that's a start - something simple to keep the email load down.
There are a couple RSS validators, but here's a start. I think it's likely that you've heard of these before - time to see if you can use one to save yourself some workload?
http://feeds.archive.org/validator/check?u...boing.net%2Frss.xml
|
Aaron Swartz
|
51
|
 |
|
02-24-2003 04:20 PM ET (US)
|
|
As for the other editors, what browsers do they use?
In Safari: Safari, Preferences, Appearance, Default Encoding, Unicode (UTF-8).
|
Aaron Swartz
|
50
|
 |
|
02-24-2003 04:18 PM ET (US)
|
|
OK, so I downloaded Mozilla and set up a Blogspot account. For some reason beyond puny human comprehension, Mozilla persists in seeing Blogger.com as being in ISO-8859-1 despite all indications and instructions to the contrary.
Luckily, if you visit the blogger posting page and select View menu, Character Encoding, Unicode (UTF-8) it does things right. This seems to stick across quitting the application and new posts.
Tech details: Mozilla can be totally wacky.
|
Eli the Bearded
|
49
|
 |
|
02-24-2003 04:16 PM ET (US)
|
|
Overall, I'd say lack of unicode support is an application problem, not a problem with people using it. Just because you only see it for smart quotes does not make those characters the real problem.
Assuming you've got a unicode system, I like to spell my surname with a "ffi" glyph (U+FB03). Not sure if QT will get it right here.
|
Cory Doctorow
|
48
|
 |
|
02-24-2003 03:48 PM ET (US)
|
|
Doesn't work.
|
Cory Doctorow
|
47
|
 |
|
02-24-2003 03:44 PM ET (US)
|
|
Let's see if that works. Of course, I have three co-editors who don't use Moz.
|
Aaron Swartz
|
46
|
 |
|
02-24-2003 03:40 PM ET (US)
|
|
Edited by author 02-24-2003 03:43 PM
OK, I think I figured out how to fix Mozilla so your RSS feed won't break again, Cory:
From the Mozilla menu, select Preferences. Click Languages in the Navigator category. Under Default Character Coding select "Unicode (UTF-8)". Click OK.
Tech details: Mozilla stupidly assumes pages are in the legacy ISO-8859-1 format, and so it sends the smart quotes in ISO-8859-1. Web browsers have code to check if the page is using ISO-8859-1 and handle it appropriately but many RSS readers don't, because XML feeds are supposed to do the right thing and use UTF-8. By telling Mozilla to do the right thing and assume UTF-8 also, Blogger gets the right characters, and puts them in the RSS feed correctly.
You should also put <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> in the <head> of the HTML template for other browsers that don't assume UTF-8.
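To see why the assumed encoding matters on the sending side, here is a rough sketch of what one curly quote becomes under each charset. It uses Python's built-in codecs; the helper name encode_or_none is made up, and this says nothing about how Mozilla or Blogger are actually implemented.

```python
LEFT_QUOTE = "\u201c"  # LEFT DOUBLE QUOTATION MARK


def encode_or_none(text: str, encoding: str):
    """Return the encoded bytes, or None if the charset lacks a character."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


for name in ("iso-8859-1", "cp1252", "utf-8"):
    print(name, encode_or_none(LEFT_QUOTE, name))
```

Strict ISO-8859-1 has no slot for the character at all, windows-1252 uses a single byte in the 128-159 range, and UTF-8 uses three bytes, so sender and receiver have to agree on which one is in play.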
|
Cory Doctorow
|
45
|
 |
|
02-24-2003 02:30 PM ET (US)
|
|
Glad it's not a problem for you. Maybe you can answer some of the 75+ emails I've received this weekend about the various curly-quotes that have snuck into BB posts by being pasted in from other blogs, breaking our RSS feed in a variety of readers...
God, the "get better tools" answer is a cop-out. How about "stop breaking the tools that people use to communicate in order to hew to some doctrinaire notion of 'correct' type?"
|
Chris Smith
|
44
|
 |
|
02-24-2003 02:25 PM ET (US)
|
|
A little digging shows that this is dependent on having one of WinXP or Win2K - AND having a Unicode version of the application in question. An ANSI build simply can't support the Unicode characters, even if you have an input method for entering them.
For purposes of having the quotes at least come out correctly, there are plugins for Movable Type that coerce your entered material into being quote-correct.
I will still note, however, that nowhere in the experimenting of the last day has anything blown up on me. In most cases, I've had to use validation tools to find the problems, which don't even show up as problems in day-to-day use.
|
Aaron Swartz
|
43
|
 |
|
02-24-2003 12:56 PM ET (US)
|
|
I'm not a Windows user and I don't know the limitations, but I tried it on WinXP in Wordpad and it works fine.
|
Chris Smith
|
42
|
 |
|
02-24-2003 12:30 PM ET (US)
|
|
At least you knew better what I meant than any Windows app I can find. (Although Unicode might be 'officially' in hex, the decimal codes work fine in many situations, such as when generating HTML versions.)
I repeated your instructions in several common windows apps and all I get is "201c" in my text. Was this instruction for a specific application or version of the OS? Or some specific combination thereof?
|
Aaron Swartz
|
41
|
 |
|
02-24-2003 11:44 AM ET (US)
|
|
"Anbody know how you actually enter Unicode character 8220 in a Windows app?"
I assume you mean character 201C (Unicode counts in hex, not decimal). It's easy to get. Type 2, 0, 1, C, Alt-x
On OS X, switch to Unicode mode, hold down the option key, and type 2, 0, 1, C.
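The 8220 versus 201C mix-up is just decimal versus hexadecimal for the same code point. A quick check, using only the Python standard library:

```python
import html

code_point = int("201C", 16)  # the hex digits typed before Alt-x
print(code_point)             # 8220
print(hex(8220))              # 0x201c

# Both HTML numeric forms name the same character.
from_decimal = html.unescape("&#8220;")
from_hex = html.unescape("&#x201C;")
print(from_decimal == from_hex == chr(code_point))
```

So the decimal form is what shows up in HTML numeric entities, while the hex form is the one used in Unicode charts and the keyboard methods above.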
|
Chris Smith
|
40
|
 |
|
02-24-2003 11:29 AM ET (US)
|
|
People on email will have to come and look at the QuickTopic archive pages, because you're not getting the full effect on email. Email doesn't send out a new page after an edit happens. Sorry.
|
Chris Smith
|
39
|
 |
|
02-24-2003 11:07 AM ET (US)
|
|
Edited by author 02-24-2003 11:11 AM
Accurate and detail-obsessive notes about character rendering problems are unfortunately necessary. I've built systems that have to straddle the edge conditions between various layers, and have, on occasion, spent Weeks resolving problems that never should have been problems in the first place.
The key problem is that what looks like a curly quote - which I assume to mean using correct left and right single and double quotes - is not always that, nor is it always the same. The difficulty is compounded by the fact that ISO-8859-1 (one of the most common character sets) does NOT include the relevant quotes, but Windows-1252 does, largely by including them in the 128 through 159 space that is not specified in ISO code sets.
The relevant common layer for appropriate quotes is Unicode, but this is hampered by the fact that many operating systems do not have good, or standard, or any, method of accurately entering the relevant characters. Anybody know how you actually enter Unicode character 8220 in a Windows app?
The only knowingly safe way to do this for now is with the HTML character entities lsquo rsquo ldquo and rdquo (with ampersands and semicolons left off so you can SEE them). If you're cutting and pasting these, then you can expect to get real quotes from all apps. For now, anything else is luck of the draw.
In HTML browsers, they look like:
lsquo rsquo ldquo rdquo
Put it in quotes, he said. So I did.
P.S.#1 Just to prove my point, I accidentally screwed up the entities in the example. Grrr.
P.S.#2 But just to prove my point that this is a luck of the draw thing - QuickTopic converts my entities to quote marks, so that when I edit this, I DON'T see my entities, I see the quotes themselves. BUT - I am no longer sure about what they are. They might now be windows-1252 quote marks, and thus they will break (or at least not display correctly) on any system that does not support that character set.
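For reference, a small sketch of the two points in this post: the four named entities map to fixed Unicode code points, while the bytes 0x91 to 0x94 only mean quotes if they are read as windows-1252. It relies on Python's html.entities table and built-in codecs; the variable names are illustrative.

```python
from html.entities import name2codepoint

# The named entities always resolve to the same Unicode code points.
for name in ("lsquo", "rsquo", "ldquo", "rdquo"):
    print(name, hex(name2codepoint[name]))

# The same four quotes as single bytes in the 128-159 range.
raw = bytes([0x91, 0x92, 0x93, 0x94])
as_cp1252 = raw.decode("cp1252")      # real quote characters
as_latin1 = raw.decode("iso-8859-1")  # C1 control codes, not quotes

print([hex(ord(ch)) for ch in as_cp1252])
print([hex(ord(ch)) for ch in as_latin1])
```

This is why the entities are the safer choice: they carry no dependency on which single-byte charset the reader assumes.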
|
cypherpunks
|
38
|
 |
|
02-24-2003 08:59 AM ET (US)
|
|
> Every single one of those apps has failed to correctly > render a curly quote generated by one or more other apps > for me, personally.
Then you're doing something wrong, because it Works For Everyone Else.
Would you mind pointing to the Mutt bug report you raised, Cory?
|
Aaron Swartz
|
37
|
 |
|
02-23-2003 04:00 PM ET (US)
|
|
Well, I'd like to hear about what curly quotes caused the problem.
There are two parties in the conversation: the generating and the receiving app. If all those apps are able to correctly render a correctly generated curly quote then it seems likely your problems were due to an incorrectly-generated one.
(Obviously if you speak Gzorgenplotz, no one will be able to understand you. That doesn't mean everyone's broken, it just means you should speak a different language.)
|
Cory Doctorow
|
36
|
 |
|
02-23-2003 03:36 PM ET (US)
|
|
You're wrong. Every single one of those apps has failed to correctly render a curly quote generated by one or more other apps for me, personally. The absence of problems for you does not indicate the absence of problems -- my car has never crashed, so car crashes don't exist.
|
Aaron Swartz
|
35
|
 |
|
02-23-2003 02:56 PM ET (US)
|
|
Cory, let's get some facts straight:
Despite your claims, BBEdit, Mozilla, Movable Type, mutt, Google Groups, AppleWorks, Word, and every modern newsreader I've played with not only doesn't barf and die, it handles smart quotes and other Unicode characters beautifully. I don't know how you got the idea they didn't.
I don't have Mailsmith, Outlook, or Outlook Express but I suspect they work too. That leaves pine, elm, WordPerfect.
Programmers who care about their users know it's not acceptable to snub everyone who wants more than ASCII's 128 basic characters. Users who are content with them are in a very quickly shrinking minority.
|
Stephane
|
34
|
 |
|
02-22-2003 08:24 PM ET (US)
|
|
The thing I find the most interesting is that people are talking about the visual appearance of curly quotes or dashes, but not many are talking about the fact that's it's a grammatical error to put the wrong punctuation.
(Don't flame saying I've made error in that text, its possible, I'm French.)
|
language hat
|
33
|
 |
|
02-22-2003 10:21 AM ET (US)
|
|
This discussion is pissing me off at computer geeks in a way I haven't been pissed off since the days when they were making fun of people who couldn't code their own programs. What's wrong with you people? Are you so wrapped up in your codes that you've forgotten why they exist? The point is to be able to create on a screen whatever you want to create, and that includes decent-looking text. The same people who delight in creating monster programs to show off the latest Flash technology (that either crash most people's computers or take forever to load) also get impatient with users who simply want their text to look good. What the hell does it mean to say "the designers are theoretically right in the same way that communism is theoretically the best possible economic system"? Are you trying to red-bait people who care about type? Jesus, by now we should be able to use any weird symbol ever created, let alone fucking em dashes; it boggles my mind that we're still clinging to ASCII. "I hate non-ASCII characters": do you realize how crazy that is coming from somebody in the 21st century who's supposedly at the cutting edge of technology? You might as well say "I hate anything other than ones and zeros"; that's all you really need, after all.
|
anildash
|
32
|
 |
|
02-22-2003 06:57 AM ET (US)
|
|
Of course you do, Anil, because you don't know what you're talking about.
Your momma.
Back on topic, I only mentioned Blogger because that's what's generating the RSS in this particular case. And, no question, lots of apps handle curly quotes incorrectly. But I think it's not unreasonable that most tools that generate RSS should either do a simple search and replace for funky characters or should accept input and output in an encoding scheme that can handle them.
Put another way, tools vendors know people are going to paste in funky quotes, why not simply throw in a couple of regular expressions to fix the problem before it happens?
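As an illustration of the "couple of regular expressions" idea, here is a minimal sketch. The mapping covers only a handful of characters and is an assumption for the example, not what Blogger or any other tool actually does.

```python
import re

# Hypothetical replacement table: typographic character -> ASCII fallback.
STRAIGHT = {
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote / apostrophe
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash
}
PATTERN = re.compile("[" + "".join(STRAIGHT) + "]")


def straighten(text: str) -> str:
    """Replace each mapped typographic character with its ASCII fallback."""
    return PATTERN.sub(lambda match: STRAIGHT[match.group(0)], text)


print(straighten("\u201cHi\u201d \u2014 it\u2019s fine"))
```

The trade-off is the one argued throughout this thread: the output is safe for ASCII-only tools, but the typographic distinctions are thrown away.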
|
Dean Allen
|
31
|
 |
|
02-22-2003 04:55 AM ET (US)
|
|
How much trouble is it to copy from a web page, paste in a BBEdit window, hit cmd-opt-t, then hit return? BBEdit converts all the high-ascii chars to numeric (or named or hex) entities automatically. Via applescript this would happen in a blink, and the resulting text won't break your RSS.
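The same kind of conversion can be sketched outside BBEdit. This version uses Python's built-in xmlcharrefreplace error handler; it is an analogous approach, not a description of what BBEdit does internally, and the function name is made up.

```python
def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_numeric_entities("\u201cHi\u201d"))
```

Note that this only touches non-ASCII characters; ampersands and angle brackets already in the text still need normal escaping before the result goes into a feed.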
That said, I think dismissing the practical value of well-stated typographic meaning as misguided attempts at suave text - or hanging it all on Robin Williams and her silly hat - is plainly reductive and quite wrong.
|
Gruber
|
30
|
 |
|
02-22-2003 01:10 AM ET (US)
|
|
|
rusty
|
29
|
 |
|
02-22-2003 12:30 AM ET (US)
|
|
Zed: The Demoroniser is gloriously public-domain, and as far as I recall it was really easy to strip out the important bits and embed them into their own little scoop routine, so it ought to be equally easy to hack it into plugin form.
|
Michael Hanscom
|
28
|
 |
|
02-21-2003 11:02 PM ET (US)
|
|
Just a thought: wouldn't it be nice if the <q> tag worked the way it was supposed to? From what I remember of reading the HTML specs, you should be able to use the <q>quote</q> tag around your quotations and have it display the correct style of quotes, no matter what language you were using (solving the English/French problem), even switching between single- and double-quotes when they're nested, or switching among different styles of quotation marks by using the 'lang' attribute (i.e., a page in English that used a <q lang="fr"> tag would display that quote with the french »« tags). Individual users could even use a custom stylesheet to muck with the display of the quotes.
Of course, that's in a perfect world where everything works the way it's supposed to. I really need to stop smoking crack.;
|
M. Sean Fosmire
|
27
|
 |
|
02-21-2003 11:02 PM ET (US)
|
|
A practical workaround: use NoteTab. Its clipboard capture facility grabs the text and converts it to plain ol' ASCII, which it copies to an open text file. Then select, cut/copy and then paste to Blogger as the destination. Plain ol' quotation marks are pasted.
|
Zed Lopez
|
26
|
 |
|
02-21-2003 07:24 PM ET (US)
|
|
Thanks, Rusty. With MovableType 2.6's newfangled plugins, it oughta be no big thang to make a demoroniser plugin for MT to generate RSS feeds and simple HTML versions of pages. Maybe I'll take a shot at making the time to do that after the weekend...
Of course, if Captain Lazy Web has free time this weekend, I'll cheerfully use his/her version.
|
rusty
|
25
|
 |
|
02-21-2003 05:18 PM ET (US)
|
|
The Demoroniser is a little perl script that takes input with any manner of wacky shit in it and produces clean ASCII html output. Several chunks of it are actually resident inside Scoop to deal with copy-and-paste from non-ASCII sources, and it works very well. If you still have to fix these things manually, then your tools are failing you. This problem has been solved for at least three years. This comment is for those who just want to solve the problem and couldn't care less about the designer vs. coder slappy-fight (in which, incidentally, the designers are theoretically right in the same way that communism is theoretically the best possible economic system).
|
Cory Doctorow
|
24
|
 |
|
02-21-2003 03:26 PM ET (US)
|
|
This isn't a Blogger issue. This is an issue with:
* BBEdit
* Mailsmith
* Outlook
* Outlook Express
* mutt
* pine
* elm
* Mozilla
* Movable Type
* Word
* AppleWorks
* WordPerfect
* Google Groups
* Virtually every newsreader, ever
All of which can barf and die if you paste curly-quotes into them.
|
Mark Kraft
|
23
|
 |
|
02-21-2003 03:23 PM ET (US)
|
|
Although I hate non-ASCII text, it's still Blogger's problem, in that it affects the usability of their product. I still don't know why RSS feed support would be so hard to fix and has taken so long, under the circumstances.
|
fraying
|
22
|
 |
|
02-21-2003 02:22 PM ET (US)
|
|
Watching design snobs and code snobs fight is like watching the age old mac/pc, right/left, dog/cat debate. Different website creators have different priorities, different audiences, and will make different decisions. There is no one right way.
|
automaticmonkey
|
21
|
 |
|
02-21-2003 12:58 PM ET (US)
|
|
I think there's some resentment because using curly quotes still seems technically pretentious: "Please note how my blog observes proper typography conventions because I know and appreciate the distinction."
With that attitude the issue of supporting one character set vs another becomes right or wrong. Which it isn't for many people (mainly programmers) who consider it a feature.
I mean, it's not wrong if your computer only displays uppercase; it just leaves a lot to be desired.
|
Eli the Bearded
|
20
|
 |
|
02-21-2003 12:42 PM ET (US)
|
|
All computers handle capital letters, it is lower-case that some have problems with.
I don't care too much for curly quotes, but I do recognize the need for other things unicode has to offer. But DAMN if it isn't difficult to get all these character set and charset issues sorted out. And it isn't helped by MIME having been specified before the problem was well understood, and it isn't helped by the scarcity of people who well understand the problem, and it isn't helped by the lack of HTTP protocol level specifications about such matters.
|
language hat
|
19
|
 |
|
02-21-2003 10:40 AM ET (US)
|
|
What El Kabong said. It's ridiculous to blame people for wanting their text to look correct. Fake quotes and double hyphens look like shit, and (as even you admit) it's ridiculous not to be able to cite foreign words correctly. I understand your irritation, but don't blame people who are trying to do it right, blame the people who did the software wrong. If it couldn't handle capital letters, would you be raging against people who wanted to use them?
|
françois
|
18
|
 |
|
02-21-2003 08:04 AM ET (US)
|
|
A in ASCII is for American. I'm not an American, and in my language (French) quotes are of yet another sort (angle quotes « »). I hate when IT quirks get in the way of cultural conventions. When I write in English I try as much as I can to respect the English conventions, same with French. It's not easy, but it's just a matter of respect and style. This whole thing would have been easier if the inventors of ASCII had had a little knowledge of foreign languages beforehand.
|
Bryant
|
17
|
 |
|
02-21-2003 06:43 AM ET (US)
|
|
Yeah, well, that winds up being the question, Patrick (and that's what I'm musing over). My RSS feed, fwiw, doesn't use smart quotes. Right now my blog does. Is that the appropriate level of smart quotage? Maybe, maybe not.
The HTML version of Down and Out has smart quotes. Heh.
|
Jeff Suttor
|
16
|
 |
|
02-21-2003 02:39 AM ET (US)
|
|
Edited by author 02-21-2003 02:39 AM
"but the tools just aren't there yet" RSS is XML and well formed XML is easy, yes trivial to work with.
|
Aaron Swartz
|
15
|
 |
|
02-21-2003 12:00 AM ET (US)
|
|
OK, I did some more research, using the two discretionary hyphens currently on the top entry, and it seems the problem is that you're posting ISO-8859-1 when you should be posting UTF-8. I think this is a browser bug; is Xeni using Mozilla?
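If the posted bytes cannot be trusted to be UTF-8, one server-side option is to check and transcode them. The sketch below assumes that anything that is not valid UTF-8 is windows-1252, which is a guess and will occasionally be wrong; the function name to_utf8 is hypothetical and does not describe anything Blogger does.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw unchanged if it is valid UTF-8, else transcode from cp1252."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 input is windows-1252 (a superset of the
        # printable part of ISO-8859-1). Undefined bytes become U+FFFD.
        return raw.decode("cp1252", errors="replace").encode("utf-8")


posted = b"co\xadoperate"  # a discretionary (soft) hyphen as a single byte
print(to_utf8(posted))
```

Input that is already valid UTF-8 passes through untouched, so running the function twice is harmless.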
|
Aaron Swartz
|
14
|
 |
|
02-20-2003 11:44 PM ET (US)
|
|
Argh, QuickTopic doesn't support Unicode either. I'll have to bug Yost.
|
Aaron Swartz
|
13
|
 |
|
02-20-2003 11:43 PM ET (US)
|
|
Seriously though, if your computer, os, browser, and blogging software can't handle Unicode, there's something wrong. I use smart quotes not only because they're nice (and I read Bringhurst, not Williams) but also because they're one small step towards a world where everyone can speak their native language on the web.
? <-- Unicode Smiley Face
|
Aaron Swartz
|
12
|
 |
|
02-20-2003 11:39 PM ET (US)
|
|
Here's a nickel, buy yourself a real computer.
|
cypherpunks
|
11
|
 |
|
02-20-2003 11:37 PM ET (US)
|
|
Of course you do, Anil, because you don't know what you're talking about. Go teach yourself some programming skills and learn what character sets actually mean. _Then_ come back and blame blogger with an informed opinion.
|
T Bryce Yehl
|
10
|
 |
|
02-20-2003 11:01 PM ET (US)
|
|
|
anildash
|
9
|
 |
|
02-20-2003 10:14 PM ET (US)
|
|
I blame Blogger's RSS generation tools.
|
El Kabong
|
8
|
 |
|
02-20-2003 09:45 PM ET (US)
|
|
"I blame Robin Williams, the designer whose "Non-Designer's Design Handbook" convinced a generation of geeks that their type would look suave if it came with em-dashes and curly-quotes"
Gee, that's funny. I blame the moronic HTML creators who failed to recognize that HTML should comply with established language conventions rather than designating a widely used punctuation mark for a special technical purpose.
|
Cowboy X
|
7
|
 |
|
02-20-2003 09:34 PM ET (US)
|
|
cont'd...
And boy does Safari do some wacky stuff with entities in TEXTAREAs as well. If a web page loads a TEXTAREA with an entity like 8212; , Safari puts the character in place instead of the spelled out entity. Then when the form is submitted, the character gets posted instead of a spelled out entity. good data in == garbage out (MSIE and Mozilla do not exhibit this behavior)
|
Cowboy X
|
6
|
 |
|
02-20-2003 09:26 PM ET (US)
|
|
Edited by author 02-20-2003 09:30 PM
Heh, I love "typographically correct" punctuation. The key is to use a blogging tool that handles it intelligently, like the new Textpattern. I think Movable Type can convert oddball punctuation to character entities as well. [edit]Just read Nelson's rebuttal, and I understand Cory's point a little better. It's all fine on a web browser page, but when you start to move content to email or RSS newsreaders some pretty weird stuff DOES happen. I've been grappling with Newzcrawler's crummy handling of entities for a while, all I can figure is that it's just broken.[/edit]
|
Patrick Nielsen Hayden
|
5
|
 |
|
02-20-2003 09:14 PM ET (US)
|
|
But they don't mean _enough_ to be worth all the trouble.
Go ahead, let the best be the enemy of the good. I'll take my stand with the good. In other words, I agree with Cory.
|
ejs
|
4
|
 |
|
02-20-2003 09:03 PM ET (US)
|
|
It's funny, people using inch- and foot-marks in place of quotes are the bane of my print-reading existence. Rather than rage against "centuries-old typesetters' conventions," why not rage against the perpetuation of the low-tech hack that made this typographical abomination so widespread and acceptable to the masses, i.e. the typewriter? Ditching quotes because the current software can't handle them is the wrong approach-- rather, the software should be fixed. Not too long ago, computers couldn't display lowercase letters, but I don't recall anyone arguing we ditch them.
And as Bryant said, quotes and dashes are not just decoration. They actually mean something.
|
Paul Palinkas
|
3
|
 |
|
02-20-2003 08:57 PM ET (US)
|
|
Sorry, but it sounds as if today's WWW software and such isn't able to handle the full range of punctuation. 7-bit ASCII is primitive; why can't software makers fix this kind of crap instead of constantly moving ahead to the next bigger, faster thing.
I think fixing all of these niggling things is an important goal as we move forward into the "semantic web" or whatever the hell the future holds. (I'm typing this into a primitive text-entry window in a web browser despite the fact that even SimpleText, which is part of my OS, can do more.)
|
Bryant
|
2
|
 |
|
02-20-2003 08:43 PM ET (US)
|
|
I gotta say, I'm of the opinion that curly quotes and em/en dashes are not merely typographer conventions; rather, they carry a little bit of extra meaning. (Well, OK, they're both. It's not an either/or.)
I find that meaning valuable. I respect Cory's experience regarding the difficulty of dealing with them, and while I think that's a fault in the software rather than a fault in the concept, it is making me ponder their use on my blog.
But I definitely want to use them eventually. It's sort of like airline ticketing -- air travel doesn't stop being useful just because the reservation systems suck the wild moose dong.
|
Craniac
|
1
|
 |
|
02-20-2003 08:08 PM ET (US)
|
|
This sounds like a job for The Lazy Web!
Please post a link to an appropriate picture of Captain Lazy Web.
|