Bryan,<br> I don't really understand what you are proposing; if you posted an executive summary, it might prompt more of a discussion. I think what you're saying is that you are going to implement something and then let us see the results, which sounds like a great way to better understand what it is you are proposing.<br>
cheers, <br>John<br><br><br><br><div class="gmail_quote">On Wed, May 14, 2008 at 4:07 PM, Bryan Bishop &lt;<a href="mailto:kanzure@gmail.com">kanzure@gmail.com</a>&gt; wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hey all,<br>
<br>
I am not able to attend the week-day conference calls because of high<br>
school scheduling issues, but otherwise I've been meaning to suggest<br>
something to the group. I'll just send it here instead. :-)<br>
<br>
I wrote an email about this in the context of space-based manufacturing:<br>
<a href="http://heybryan.org/2008-05-09.html" target="_blank">http://heybryan.org/2008-05-09.html</a><br>
<br>
But let me try to put things into context. I know that OWW has a good<br>
representation of programmers around these parts, so when I reference<br>
debian, I'm hoping it's not entirely lost. Take a look here:<br>
<br>
<a href="http://debian.org/" target="_blank">http://debian.org/</a> and its Wikipedia article,<br>
<a href="http://ubuntu.org/" target="_blank">http://ubuntu.org/</a> And more concisely:<br>
"Debian is known for strict adherence to the Unix and free software<br>
philosophies. Debian is also known for its abundance of options — the<br>
current release includes over twenty-six thousand software packages for<br>
eleven computer architectures. These architectures range from the<br>
Intel/AMD 32-bit/64-bit architectures commonly found in personal<br>
computers to the ARM architecture commonly found in embedded systems<br>
and the IBM eServer zSeries mainframes. Throughout Debian's lifetime,<br>
other distributions have taken it as a basis to develop their own,<br>
including: Ubuntu, MEPIS, Dreamlinux, Damn Small Linux, Xandros,<br>
Knoppix, Linspire, sidux, Kanotix, and LinEx among others. A<br>
university's study concluded that Debian's 283 million source code<br>
lines would cost 10 billion USA Dollars to develop by proprietary<br>
means."<br>
<br>
"Ubuntu's popularity has climbed steadily since its 2004 release. It has<br>
been the most viewed Linux distribution on Distrowatch.com in 2005,[4]<br>
2006,[5] In an August 2007 survey of 38,500 visitors on<br>
DesktopLinux.com, Ubuntu was the most popular distribution with 30.3<br>
percent of respondents using it.[7] Third party sites have arisen to<br>
provide Ubuntu packages outside of the Ubuntu organization. Ubuntu was<br>
awarded the Reader Award for best Linux distribution at the 2005<br>
LinuxWorld Conference and Expo in London.[107] It has been favorably<br>
reviewed in online and print publications.[108][109][110] Ubuntu won<br>
InfoWorld's 2007 Bossie Award for Best Open Source Client OS.[111] Mark<br>
Shuttleworth indicates that there were at least 8 million Ubuntu users<br>
at the end of 2006.[112] The large user-base has resulted in a large<br>
stable of non-Canonical websites. These include general help sites like<br>
Easy Ubuntu Linux,[113] dedicated weblogs (Ubuntu Gazette),[114] and<br>
niche sites within the Ubuntu Linux niche itself (Ubuntu Women).[115]<br>
The year 2007 saw the online publication of the first magazine<br>
dedicated to Ubuntu, Full Circle.[116]"<br>
<br>
So, just what made these so successful, to the point where debian<br>
represents $10 billion USD of effort, all done by volunteer work?<br>
There's a bit more to mention:<br>
<br>
<a href="http://advogato.org/article/972.html" target="_blank">http://advogato.org/article/972.html</a><br>
<br>
"What are the issues? Why is it so important to go "distributed"?<br>
<br>
Debian is the largest independent and one of the longest-running of the Free<br>
Software Distributions in existence. There are over 1000 maintainers;<br>
nearly 20,000 packages. There are over 40 "Primary" Mirrors, and<br>
something like one hundred secondary mirrors (listed here - I'm stunned<br>
and shocked at the numbers!). 14 architectures are supported - 13 Linux<br>
ports and one GNU/Hurd port but only for i386 (aww bless iiit). A<br>
complete copy of the mirrors and their architectures, including source<br>
code, is over 160 gigabytes.<br>
<br>
At the last major upgrade of Debian/Stable, all the routers at the major<br>
International fibreoptic backbone sites across the world redlined for a<br>
week.<br>
<br>
To say that Debian is "big" is an understatement of the first order.<br>
<br>
Many mirror sites simply cannot cope with the requirements. Statistics<br>
on the Debian UK Mirror for July 2004 to June 2005 show 1.4 Terabytes<br>
of data served. As you can see from the list of mirror sites, many of<br>
the Secondary Mirrors and even a couple of the Primary ones have<br>
dropped certain architectures.<br>
<br>
<a href="http://security.debian.org" target="_blank">security.debian.org</a> - perhaps the most important of all the Debian<br>
sites - is definitely overloaded and undermirrored.<br>
<br>
This isn't all: there are mailing lists (the statistics show almost<br>
30,000 people on each of the announce and security lists, alone), and<br>
IRC channels - and both of those are over-spammed. The load on the<br>
mailing list server is so high that an idea (discussed informally at<br>
Debconf7 and outlined here later in this article, for completeness) to<br>
create an opt-in spam/voting system for people to "vet" postings and<br>
comments, was met with genuine concern and trepidation by the mailing<br>
list's maintainers.<br>
<br>
It's incredible that Debian Distribution and Development hasn't fallen<br>
into a big steaming heap of broken pieces, with administrators, users<br>
and ISPs all screaming at each other and wanting to scratch each<br>
others' eyes out on the mailing lists and IRC channels, only to find<br>
that those aren't there either.<br>
<br>
So it's basically coming through loud and clear: "server-based"<br>
infrastructure is simply not scalable, and the situation is only going<br>
to get worse as time progresses. That leaves "distributed<br>
architecture" - aka peer-to-peer architecture - as the viable<br>
alternative."<br>
<br>
In other words, it's the social structure and community around debian,<br>
the 26,000 software packages, and that incredibly easy command where<br>
you can grab *any* software package and have it immediately installed.<br>
It's from a software repository. Kind of like biobricks, except<br>
functional. By that I don't mean biobricks is dysfunctional, but that<br>
biobricks is about data, whereas debian's apt is about software and<br>
functionality.<br>
<br>
This is what one of my projects focuses on - that sort of easy gradient<br>
by which not only programs and software can be downloaded, but open<br>
access information, and open source projects of any sort, whether from<br>
the Maker Communities, the diybio groups, debian, gentoo, etc.<br>
<br>
For a dense explanation:<br>
<a href="http://heybryan.org/exp.html" target="_blank">http://heybryan.org/exp.html</a><br>
<br>
The 'architecture' is really ridiculously simple: it's just putting<br>
together some components that have been out on the web for a while. For<br>
example, all wikis have a revision control system, even the mediawiki<br>
installation for OWW. These revision systems, though, existed long<br>
before wikis popped up, and I am particularly interested in 'git'. For<br>
this reason I am also interested in ikiwiki, which can be made to look<br>
exactly like mediawiki, with the important difference that it's<br>
based on 'git' for the revision control / history. This means that<br>
pages can be branched and so on, by anybody interested.<br>
<br>
It also means that you're not just providing open access data, but also<br>
the entire project [if the researcher is interested in going that far,<br>
of course]. All of the files - source code, CAD, diagrams via dia or<br>
graphviz, SVG, documentation, latex-source of the papers, notes, etc.<br>
<br>
It's really easy to implement.<br>
<br>
It's an extension of "open access" and "open source" in that it makes<br>
the whole "semantic web" thing really truly functional, making it<br>
actually *do* something.<br>
<br>
And it's a useful way of doing research. What's the quote? The one from<br>
Gregory Wilson on bottlenecks in scientific computing?<br>
<a href="http://www.americanscientist.org/template/AssetDetail/assetid/48548" target="_blank">http://www.americanscientist.org/template/AssetDetail/assetid/48548</a><br>
<a href="http://www.cs.toronto.edu/%7Egvwilson/" target="_blank">http://www.cs.toronto.edu/~gvwilson/</a><br>
'figuring out how to make scientific programmers more productive'<br>
<br>
"Those Who Will Not Learn From History..."<br>
Beautiful Code<br>
"Requirements in the Wild"<br>
"DrProject: A Software Project Management Portal to Meet Educational<br>
Needs"<br>
"Software Carpentry"<br>
Data Crunching<br>
"Learning By Doing: Introducing Version Control as a Way to Manage<br>
Student Assignments"<br>
"Where's the Real Bottleneck in Scientific Computing?"<br>
"Extensible Programming for the 21st Century"<br>
"Open Source, Cold Shoulder"<br>
<br>
Anyway, the only thing left for implementation is changing up mediawiki<br>
a bit, writing some introductory tutorials [which I am doing anyway on<br>
another front], and then figuring out the file structure format (using<br>
YAML, so it's just writing classes in python), which frankly I think is<br>
something that individual researchers would be more suited to doing.<br>
For example, that's why we have the excellent Systems Biology Markup<br>
Language (<a href="http://sbml.org" target="_blank">sbml.org</a>), and I don't exactly have a broad enough overview<br>
of the field to make it happen.<br>
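<br>
As a rough sketch only of what "writing classes in python" for the file<br>
structure might look like: the class name, the field names and the sample<br>
records below are all hypothetical, and the plain dictionaries stand in for<br>
whatever a YAML loader would return. It shows an apt-style lookup that lists<br>
a project's dependencies before the project itself.<br>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class ProjectPackage:
    """Hypothetical metadata record for one shareable project."""
    name: str
    version: str
    files: list = field(default_factory=list)
    depends: list = field(default_factory=list)


def load_package(record):
    """Build a ProjectPackage from a plain dict (a stand-in for parsed YAML)."""
    return ProjectPackage(
        name=record["name"],
        version=record["version"],
        files=list(record.get("files", [])),
        depends=list(record.get("depends", [])),
    )


def install_order(name, index, done=None, visiting=None):
    """Return package names with every dependency listed before its dependent."""
    done = [] if done is None else done
    visiting = set() if visiting is None else visiting
    if name in done:
        return done
    if name in visiting:
        # A package that (indirectly) depends on itself cannot be ordered.
        raise ValueError("dependency cycle at " + name)
    visiting.add(name)
    for dep in index[name].depends:
        install_order(dep, index, done, visiting)
    visiting.discard(name)
    done.append(name)
    return done


# Made-up example records; a real repository would read these from files.
RECORDS = [
    {"name": "pcr-protocol", "version": "0.1",
     "files": ["protocol.tex"], "depends": ["primer-design"]},
    {"name": "primer-design", "version": "0.2", "files": ["design.py"]},
]
INDEX = {r["name"]: load_package(r) for r in RECORDS}

if __name__ == "__main__":
    print(install_order("pcr-protocol", INDEX))
```
</pre>
Which fields a project record actually needs is exactly the part that<br>
researchers in each field would be better placed to decide.<br>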
<br>
You get all of the benefits of software reuse, but with project reuse,<br>
with all of the sharing and acceleration of progress that the internet<br>
can allow for. So what are the general thoughts on this?<br>
<br>
- Bryan<br>
________________________________________<br>
<a href="http://heybryan.org/" target="_blank">http://heybryan.org/</a><br>
<br>
_______________________________________________<br>
OpenWetWare Discussion Mailing List<br>
<a href="mailto:discuss@openwetware.org">discuss@openwetware.org</a><br>
<a href="http://mailman.mit.edu/mailman/listinfo/oww-discuss" target="_blank">http://mailman.mit.edu/mailman/listinfo/oww-discuss</a><br>
</blockquote></div><br><br clear="all"><br>-- <br>John Cumbers, Graduate Student<br>Molecular Biology, Cell Biology, and Biochemistry <br>Biology and Medicine, Brown University, Box G-W Providence, Rhode Island, 02912, USA<br>
Tel USA: +1 401 523 8190, Fax: +1 401 863-2166, UK to USA: 0207 617 7824<br><br>NASA Ames Research Center Mail Stop 239-20, Bldg N239 Rm 371 <br>Moffett Field, CA 94035<br><br>