Andrew Binstock and Donald Knuth converse on the success of open source, the problem with multicore architecture, the disappointing lack of interest in literate programming, the menace of reusable code, and that urban legend about winning a programming contest with a single compilation.
Andrew Binstock: You are one of the fathers of the open-source revolution, even if you aren’t widely heralded as such. You previously have stated that you released TeX as open source because of the problem of proprietary implementations at the time, and to invite corrections to the code—both of which are key drivers for open-source projects today. Have you been surprised by the success of open source since that time?
Donald Knuth: The success of open source code is perhaps the only thing in the computer field that hasn’t surprised me during the past several decades. But it still hasn’t reached its full potential; I believe that open-source programs will begin to be completely dominant as the economy moves more and more from products towards services, and as more and more volunteers arise to improve the code.
For example, open-source code can produce thousands of binaries, tuned perfectly to the configurations of individual users, whereas commercial software usually will exist in only a few versions. A generic binary executable file must include things like inefficient "sync" instructions that are totally inappropriate for many installations; such wastage goes away when the source code is highly configurable. This should be a huge win for open source.
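A minimal C sketch may make that point concrete. The NEED_SYNC flag here is hypothetical, standing in for whatever a package's build-time configuration would define; __sync_synchronize() is a GCC builtin that emits a full memory barrier:

    #include <stdio.h>

    #ifdef NEED_SYNC                      /* hypothetical flag, set at build time */
    #define SYNC() __sync_synchronize()   /* GCC builtin: full memory barrier */
    #else
    #define SYNC() ((void)0)              /* compiled away entirely on machines
                                             that do not need the barrier */
    #endif

    void publish(int *slot, int value)
    {
        *slot = value;
        SYNC();        /* the "sync" cost is paid only where the
                          configuration says it is actually needed */
    }

    int main(void)
    {
        int slot = 0;
        publish(&slot, 42);
        printf("%d\n", slot);
        return 0;
    }

A one-size-fits-all binary must ship with the barrier unconditionally; a site that compiles from source with the flag off gets machine code in which the instruction never appears at all.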
Yet I think that a few programs, such as Adobe Photoshop, will always be superior to competitors like the Gimp—for some reason, I really don’t know why! I’m quite willing to pay good money for really good software, if I believe that it has been produced by the best programmers.
Remember, though, that my opinion on economic questions is highly suspect, since I’m just an educator and scientist. I understand almost nothing about the marketplace.
Andrew: A story states that you once entered a programming contest at Stanford (I believe) and you submitted the winning entry, which worked correctly after a single compilation. Is this story true? In that vein, today’s developers frequently build programs by writing small code increments followed by immediate compilation and the creation and running of unit tests. What are your thoughts on this approach to software development?
Donald: The story you heard is typical of legends that are based on only a small kernel of truth. Here’s what actually happened: John McCarthy decided in 1971 to have a Memorial Day Programming Race. All of the contestants except me worked at his AI Lab up in the hills above Stanford, using the WAITS time-sharing system; I was down on the main campus, where the only computer available to me was a mainframe for which I had to punch cards and submit them for processing in batch mode. I used Wirth’s ALGOL W system (the predecessor of Pascal). My program didn’t work the first time, but fortunately I could use Ed Satterthwaite’s excellent offline debugging system for ALGOL W, so I needed only two runs. Meanwhile, the folks using WAITS couldn’t get enough machine cycles because their machine was so overloaded. (I think that the second-place finisher, using that "modern" approach, came in about an hour after I had submitted the winning entry with old-fangled methods.) It wasn’t a fair contest.
As to your real question, the idea of immediate compilation and "unit tests" appeals to me only rarely, when I’m feeling my way in a totally unknown environment and need feedback about what works and what doesn’t. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be "mocked up."
Andrew: One of the emerging problems for developers, especially client-side developers, is changing their thinking to write programs in terms of threads. This concern, driven by the advent of inexpensive multicore PCs, surely will require that many algorithms be recast for multithreading, or at least made thread-safe. So far, much of the work you’ve published for Volume 4 of The Art of Computer Programming (TAOCP) doesn’t seem to touch on this dimension. Do you expect to enter into problems of concurrency and parallel programming in upcoming work, especially since it would seem to be a natural fit with the combinatorial topics you’re currently working on?
Donald: The field of combinatorial algorithms is so vast that I’ll be lucky to pack its sequential aspects into three or four physical volumes, and I don’t think the sequential methods are ever going to be unimportant. Conversely, the half-life of parallel techniques is very short, because hardware changes rapidly and each new machine needs a somewhat different approach. So I decided long ago to stick to what I know best. Other people understand parallel machines much better than I do; programmers should listen to them, not me, for guidance on how to deal with simultaneity.
Andrew: Vendors of multicore processors have expressed frustration at the difficulty of moving developers to this model. As a former professor, what thoughts do you have on this transition and how to make it happen? Is it a question of proper tools, such as better native support for concurrency in languages, or of execution frameworks? Or are there other solutions?
Donald: I don’t want to duck your question entirely. I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they’re trying to pass the blame for the future demise of Moore’s Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won’t be surprised at all if the whole multithreading idea turns out to be a flop, worse than the "Itanium" approach that was supposed to be so terrific—until it turned out that the wished-for compilers were basically impossible to write.
Let me put it this way: During the past 50 years, I’ve written well over a thousand programs, many of which have substantial size. I can’t think of even five of those programs that would have been enhanced noticeably by parallelism or multithreading. Surely, for example, multiple processors are no help to TeX.[1]
How many programmers do you know who are enthusiastic about these promised machines of the future? I hear almost nothing but grief from software people, although the hardware folks in our department assure me that I’m wrong.
I know that important applications for parallelism exist—rendering graphics, breaking codes, scanning images, simulating physical and biological processes, etc. But all these applications require dedicated code and special-purpose techniques, which will need to be changed substantially every few years.
Even if I knew enough about such methods to write about them in TAOCP, my time would be largely wasted, because soon there would be little reason for anybody to read those parts. (Similarly, when I prepare the third edition of Volume 3 I plan to rip out much of the material about how to sort on magnetic tapes. That stuff was once one of the hottest topics in the whole software field, but now it largely wastes paper when the book is printed.)
The machine I use today has dual processors. I get to use them both only when I’m running two independent jobs at the same time; that’s nice, but it happens only a few minutes every week. If I had four processors, or eight, or more, I still wouldn’t be any better off, considering the kind of work I do—even though I’m using my computer almost every day during most of the day. So why should I be so happy about the future that hardware vendors promise? They think a magic bullet will come along to make multicores speed up my kind of work; I think it’s a pipe dream. (No—that’s the wrong metaphor! "Pipelines" actually work for me, but threads don’t. Maybe the word I want is "bubble.")
From the opposite point of view, I do grant that web browsing probably will get better with multicores. I’ve been talking about my technical work, however, not recreation. I also admit that I haven’t got many bright ideas about what I wish hardware designers would provide instead of multicores, now that they’ve begun to hit a wall with respect to sequential computation. (But my MMIX design contains several ideas that would substantially improve the current performance of the kinds of programs that concern me most—at the cost of incompatibility with legacy x86 programs.)
Andrew: One of the few projects of yours that hasn’t been embraced by a widespread community is literate programming. What are your thoughts about why literate programming didn’t catch on? And is there anything you’d have done differently in retrospect regarding literate programming?
Donald: Literate programming is a very personal thing. I think it’s terrific, but that might well be because I’m a very strange person. It has tens of thousands of fans, but not millions.
In my experience, software created with literate programming has turned out to be significantly better than software developed in more traditional ways. Yet ordinary software is usually okay—I’d give it a grade of C (or maybe C++), but not F; hence, the traditional methods stay with us. Since they’re understood by a vast community of programmers, most people have no big incentive to change, just as I’m not motivated to learn Esperanto even though it might be preferable to English and German and French and Russian (if everybody switched).
Jon Bentley probably hit the nail on the head when he once was asked why literate programming hasn’t taken the whole world by storm. He observed that a small percentage of the world’s population is good at programming, and a small percentage is good at writing; apparently I am asking everybody to be in both subsets.
Yet to me, literate programming is certainly the most important thing that came out of the TeX project. Not only has it enabled me to write and maintain programs faster and more reliably than ever before, and been one of my greatest sources of joy since the 1980s—it has actually been indispensable at times. Some of my major programs, such as the MMIX meta-simulator, could not have been written with any other methodology that I’ve ever heard of. The complexity was simply too daunting for my limited brain to handle; without literate programming, the whole enterprise would have flopped miserably.
If people do discover nice ways to use the newfangled multithreaded machines, I would expect the discovery to come from people who routinely use literate programming. Literate programming is what you need to rise above the ordinary level of achievement. But I don’t believe in forcing ideas on anybody. If literate programming isn’t your style, please forget it and do what you like. If nobody likes it but me, let it die.
On a positive note, I’ve been pleased to discover that the conventions of CWEB are already standard equipment within preinstalled software such as Makefiles, when I get off-the-shelf Linux these days.
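Concretely, GNU Make’s catalogue of built-in rules already understands CWEB: foo.c is made from foo.w by $(CTANGLE), and foo.tex is made from foo.w by $(CWEAVE). A minimal sketch for a hypothetical program foo (recipe lines in a real Makefile begin with a tab):

    # No tangling or weaving rules need to be written; the built-in
    # rules derive foo.c and foo.tex from foo.w automatically.
    foo: foo.o
            $(CC) -o $@ foo.o

    foo.dvi: foo.tex
            tex foo.tex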
Andrew: In Fascicle 1 of Volume 1, you reintroduced the MMIX computer, which is the 64-bit upgrade to the venerable MIX machine comp-sci students have come to know over many years. You previously described MMIX in great detail in MMIXware. I’ve read portions of both books, but can’t tell whether the Fascicle updates or changes anything that appeared in MMIXware, or whether it’s a pure synopsis. Could you clarify?
Donald: Volume 1 Fascicle 1 is a programmer’s introduction, which includes instructive exercises and such things. The MMIXware book is a detailed reference manual, somewhat terse and dry, plus a bunch of literate programs that describe prototype software for people to build upon. Both books define the same computer (once the errata to MMIXware are incorporated from my website). For most readers of TAOCP, the first fascicle contains everything about MMIX that they’ll ever need or want to know.
I should point out, however, that MMIX isn’t a single machine; it’s an architecture with almost unlimited varieties of implementations, depending on different choices of functional units, different pipeline configurations, different approaches to multiple-instruction-issue, different ways to do branch prediction, different cache sizes, different strategies for cache replacement, different bus speeds, etc. Some instructions and/or registers can be emulated with software on "cheaper" versions of the hardware. And so on. It’s a test bed, all simulatable with my meta-simulator, even though advanced versions would be impossible to build effectively until another five years go by (and then we could ask for even further advances just by advancing the meta-simulator specs another notch).
Suppose you want to know if five separate multiplier units and/or three-way instruction issuing would speed up a given MMIX program. Or maybe the instruction and/or data cache could be made larger or smaller or more associative. Just fire up the meta-simulator and see what happens.
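Such an experiment amounts to editing a small plain-text configuration and rerunning the simulation. The fragment below merely illustrates the idea; these parameter names are invented for this sketch and are not the actual syntax of the MMIXware configuration files:

    multunits    5        (five separate multiplier units -- name invented)
    issuerate    3        (three-way instruction issue -- name invented)
    dcachesize   16384    (bytes of data cache -- name invented)
    dcacheassoc  4        (four-way set associativity -- name invented)

Simulating the same MMIX program under two such configurations and comparing the reported cycle counts answers the question directly.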
Andrew: As I suspect you don’t use unit testing with MMIXAL, could you step me through how you go about making sure that your code works correctly under a wide variety of conditions and inputs? If you have a specific work routine around verification, could you describe it?
Donald: Most examples of machine language code in TAOCP appear in Volumes 1-3; by the time we get to Volume 4, such low-level detail is largely unnecessary and we can work safely at a higher level of abstraction. Thus, I’ve needed to write only a dozen or so MMIX programs while preparing the opening parts of Volume 4, and they’re all pretty much toy programs—nothing substantial. For little things like that, I just use informal verification methods, based on the theory that I’ve written up for the book, together with the MMIXAL assembler and MMIX simulator that are readily available on the Net (and described in full detail in the MMIXware book).
That simulator includes debugging features like the ones I found so useful in Ed Satterthwaite’s system for ALGOL W, mentioned earlier. I always feel quite confident after checking a program with those tools.
Andrew: Despite its formulation many years ago, TeX is still thriving, primarily as the foundation for LaTeX. While TeX has been effectively frozen at your request, are there features that you would want to change or add to it, if you had the time and bandwidth? If so, what are the major items you would add or change?
Donald: I believe changes to TeX would cause much more harm than good. Other people who want other features are creating their own systems, and I’ve always encouraged further development—except that nobody should give their program the same name as mine. I want to take permanent responsibility for TeX and Metafont, and for all the nitty-gritty things that affect existing documents that rely on my work, such as the precise dimensions of characters in the Computer Modern fonts.
Andrew: One of the little-discussed aspects of software development is how to do design work on software in a completely new domain. You were faced with this issue when you undertook TeX: No prior art was available to you as source code, and it was a domain in which you weren’t an expert. How did you approach the design, and how long did it take before you were comfortable entering into the coding portion?
Donald: That’s another good question! I’ve discussed the answer in great detail in Chapter 10 of my book Literate Programming, together with Chapters 1 and 2 of my book Digital Typography. I think that anybody who is really interested in this topic will enjoy reading those chapters. (See also Digital Typography Chapters 24 and 25 for the complete first and second drafts of my initial design of TeX in 1977.)
Andrew: The books on TeX and the program itself show a clear concern for limiting memory usage—an important problem for systems of that era. Today, the concern for memory usage in programs has more to do with cache sizes. Having designed a processor in software yourself, you must surely have had the issues of cache-aware and cache-oblivious algorithms cross your radar screen. Is the role of processor caches on algorithm design something that you expect to cover, even if indirectly, in your upcoming work?
Donald: I mentioned earlier that MMIX provides a test bed for many varieties of cache. And it’s a software-implemented machine, so we can perform experiments that will be repeatable even a hundred years from now. Certainly the next editions of Volumes 1-3 will discuss the behavior of various basic algorithms with respect to different cache parameters.
In Volume 4 so far, I count about a dozen references to cache memory and cache-friendly approaches (not to mention a "memo cache," which is a different but related idea in software).
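The software sense of the term deserves a tiny illustration: a memo cache remembers previously computed results of a function, whereas a hardware cache remembers recently used memory. A minimal generic sketch in C, not taken from TAOCP:

    #include <stdint.h>

    #define MEMO_SIZE 94    /* fib(n) fits in 64 bits only for n <= 93 */

    static uint64_t memo[MEMO_SIZE];    /* remembered results */
    static int      known[MEMO_SIZE];   /* which entries are filled in */

    /* Memoized Fibonacci, assuming 0 <= n <= 93: each value is computed
       once, then served straight from the memo cache ever after. */
    uint64_t fib(int n)
    {
        if (n < 2) return (uint64_t)n;
        if (!known[n]) {
            memo[n] = fib(n - 1) + fib(n - 2);
            known[n] = 1;
        }
        return memo[n];
    }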
Andrew: What set of tools do you use today for writing TAOCP? Do you use TeX? LaTeX? CWEB? Word processor? And what do you use for the coding?
Donald: My general working style is to write everything first with pencil and paper, sitting beside a big wastebasket. Then I use Emacs to enter the text into my machine, using the conventions of TeX. I use tex, dvips, and gv to see the results, which appear on my screen almost instantaneously these days. I check my math with Mathematica.
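That edit-and-view cycle is an ordinary command pipeline. The file name below is hypothetical, but the commands are the standard ones:

    tex chapter.tex          # typeset, producing chapter.dvi
    dvips chapter.dvi -o     # convert to PostScript in chapter.ps
    gv chapter.ps &          # preview the result on screen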
I program every algorithm that’s discussed (so that I can thoroughly understand it) using CWEB, which works splendidly with the GDB debugger. I make the illustrations with MetaPost (or, in rare cases, on a Mac with Adobe Photoshop or Illustrator). I have some homemade tools, like my own spell-checker for TeX and CWEB within Emacs. I designed my own bitmap font for use with Emacs, because I hate the way the ASCII apostrophe and the left open quote have morphed into independent symbols that no longer match each other visually. I have special Emacs modes to help me classify all the tens of thousands of papers and notes in my files, and special Emacs keyboard shortcuts that make bookwriting a little bit like playing an organ. I prefer rxvt to xterm for terminal input. Since last December, I’ve been using a file backup system called backupfs, which beautifully meets my need to archive the daily state of every file.
According to the current directories on my machine, I’ve written 68 different CWEB programs so far this year. There were about 100 in 2007, 90 in 2006, 100 in 2005, 90 in 2004, etc. Furthermore, CWEB has an extremely convenient "change file" mechanism, with which I can rapidly create multiple versions and variations on a theme; so far in 2008 I’ve made 73 variations on those 68 themes. (Some of the variations are quite short, only a few bytes; others are 5KB or more. Some of the CWEB programs are quite substantial, like the 55-page BDD package that I completed in January.) Thus, you can see how important literate programming is in my life.
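The change-file mechanism is simple enough to show whole. A .ch file is a sequence of replacements: between @x and @y come lines copied exactly from the master .w source, and between @y and @z come the substitute lines. A hypothetical variation on a program prog.w:

    @x
    #define BUFSIZE 1024 /* size of the input buffer */
    @y
    #define BUFSIZE 65536 /* enlarged buffer for this variation */
    @z

Running ctangle prog.w variant.ch then produces the varied program without touching the master source, which is how dozens of variations on a few dozen themes stay manageable.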
I currently use Ubuntu Linux, on a standalone laptop—it has no Internet connection. I occasionally carry flash memory drives between this machine and the Macs that I use for network surfing and graphics; but I trust my family jewels only to Linux. Incidentally, with Linux I much prefer the keyboard focus that I can get with classic FVWM to the GNOME and KDE environments that other people seem to like better. To each their own.
Andrew: You state in the preface of Fascicle 0 of Volume 4 of TAOCP that Volume 4 surely will comprise three volumes and possibly more. It’s clear from the text that you’re really enjoying writing on this topic. Given that, what is your confidence in the note posted on the TAOCP website that Volume 5 will see the light of day by 2015?
Donald: If you check the Wayback Machine for previous incarnations of that web page, you will see that the number 2015 has not been constant.
You’re certainly correct that I’m having a ball writing up this material, because I keep running into fascinating facts that simply can’t be left out—even though more than half of my notes don’t make the final cut.
Precise time estimates are impossible, because I can’t tell until getting deep into each section how much of the stuff in my files is going to be really fundamental and how much of it is going to be irrelevant to my book or too advanced. A lot of the recent literature is academic one-upmanship of limited interest to me; authors these days often introduce arcane methods that outperform the simpler techniques only when the problem size exceeds the number of protons in the universe. Such algorithms could never be important in a real computer application. I read hundreds of such papers to see if they might contain nuggets for programmers, but most of them wind up getting short shrift.
From a scheduling standpoint, all I know at present is that I must someday digest a huge amount of material that I’ve been collecting and filing for 45 years. I gain important time by working in batch mode: I don’t read a paper in depth until I can deal with dozens of others on the same topic during the same week. When I finally am ready to read what has been collected about a topic, I might find out that I can zoom ahead because most of it is eminently forgettable for my purposes. On the other hand, I might discover that it’s fundamental and deserves weeks of study; then I’d have to edit my website and push that number 2015 closer to infinity.
Andrew: In late 2006, you were diagnosed with prostate cancer. How is your health today?
Donald: Naturally, the cancer will be a serious concern. I have superb doctors. At the moment I feel as healthy as ever, modulo being 70 years old. Words flow freely as I write TAOCP and as I write the literate programs that precede drafts of TAOCP. I wake up in the morning with ideas that please me, and some of those ideas actually please me also later in the day when I’ve entered them into my computer.
On the other hand, I willingly put myself in God’s hands with respect to how much more I’ll be able to do before cancer or heart disease or senility or whatever strikes. If I should unexpectedly die tomorrow, I’ll have no reason to complain, because my life has been incredibly blessed. Conversely, as long as I’m able to write about computer science, I intend to do my best to organize and expound upon the tens of thousands of technical papers that I’ve collected and made notes on since 1962.
Andrew: On your website, you mention that the Peoples Archive recently made a series of videos in which you reflect on your past life. In segment 93, "Advice to Young People," you advise that people shouldn’t do something simply because it’s trendy. As we know all too well, software development is as subject to fads as any other discipline. Can you give some examples that are currently in vogue, which developers shouldn’t adopt simply because they’re currently popular or because that’s the way they’re currently done? Would you care to identify important examples of this outside of software development?
Donald: Hmm. That question is almost contradictory, because I’m basically advising young people to listen to themselves rather than to others, and I’m one of the others. Almost every biography of every person whom you would like to emulate will say that he or she did many things against the "conventional wisdom" of the day.
Still, I hate to duck your questions even though I also hate to offend other people’s sensibilities—given that software methodology has always been akin to religion. With the caveat that there’s no reason anybody should care about the opinions of a computer scientist/mathematician like me regarding software development, let me just say that almost everything I’ve ever heard associated with the term "extreme programming" sounds like exactly the wrong way to go...with one exception. The exception is the idea of working in teams and reading each other’s code. That idea is crucial, and it might even mask out all the terrible aspects of extreme programming that alarm me.
I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you’re totally convinced that reusable code is wonderful, I probably won’t be able to sway you anyway, but you’ll never convince me that reusable code isn’t mostly a menace.
Here’s a question that you may well have meant to ask: Why is the new book called Volume 4 Fascicle 0, instead of Volume 4 Fascicle 1? The answer is that computer programmers will understand that I wasn’t ready to begin writing Volume 4 of TAOCP at its true beginning point, because we know that the initialization of a program can’t be written until the program itself takes shape. So I started in 2005 with Volume 4 Fascicle 2, after which came Fascicles 3 and 4. (Think of Star Wars, which began with Episode 4.)
Finally I was psyched up to write the early parts, but I soon realized that the introductory sections needed to include much more stuff than would fit into a single fascicle. Therefore, remembering Dijkstra’s dictum that counting should begin at 0, I decided to launch Volume 4 with Fascicle 0. Look for Volume 4 Fascicle 1 later this year.
Footnote
[1] My colleague Kunle Olukotun points out that, if the usage of TeX became a major bottleneck so that people had a dozen processors and really needed to speed up their typesetting terrifically, a super-parallel version of TeX could be developed that uses "speculation" to typeset a dozen chapters at once: Each chapter could be typeset under the assumption that the previous chapters don’t do anything strange to mess up the default logic. If that assumption fails, we can fall back on the normal method of doing a chapter at a time; but in the majority of cases, when only normal typesetting was being invoked, the processing would indeed go 12 times faster. Users who cared about speed could adapt their behavior and use TeX in a disciplined way.
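A minimal sketch of that control structure, with hypothetical typeset_* helpers standing in for the real typesetting work; this shows the shape of the idea, not an actual parallel TeX:

    #include <stdbool.h>

    #define CHAPTERS 12

    /* Hypothetical helpers: the speculative version reports failure if
       its chapter turned out to depend on unusual state left behind by
       an earlier chapter; the sequential version always succeeds. */
    bool typeset_chapter_speculatively(int k);
    void typeset_chapter_sequentially(int k);

    void typeset_book(void)
    {
        bool ok[CHAPTERS];

        /* Speculate: typeset all chapters at once, each assuming the
           previous chapters left the default typesetting state.
           (Compile with -fopenmp, or substitute any thread pool.) */
        #pragma omp parallel for
        for (int k = 0; k < CHAPTERS; k++)
            ok[k] = typeset_chapter_speculatively(k);

        /* Fall back: from the first chapter whose assumption failed,
           redo everything sequentially, a chapter at a time, so the
           result is always identical to a purely sequential run. */
        for (int k = 0; k < CHAPTERS; k++)
            if (!ok[k]) {
                for (int j = k; j < CHAPTERS; j++)
                    typeset_chapter_sequentially(j);
                break;
            }
    }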
Andrew Binstock is the principal analyst at Pacific Data Works. He is a columnist for SD Times and senior contributing editor for InfoWorld magazine. His blog can be found at: http://binstock.blogspot.com.