joey's blog

Ubuntu 9.10 Coming Soon

Python Tip: Nested Lists for Irregular Data

Programming and working with multi-dimensional arrays is a common task for engineers. For example, say you have some data from 3 sensors, and you want to sample the data 10 times. That would naturally lead to a 10x3 array that can be indexed as: a[time][sensor].

In my case, I have a number of sensors that sometimes don't report their data. The wireless transmission often gets corrupted, preventing the data from reaching the basestation on my laptop. In this case, I could have 2 sensors that report 10 samples of data, but one that only reports 8. There's an irregularity in the structure now, and using a 10x3 array would require a bunch of overhead checking to make sure I don't count missing data as reported data.
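To make that overhead concrete, here is a minimal sketch of the fixed-size approach, assuming a lost sample is stored as None. The readings and the helper name sensor_mean are made up for illustration.

```python
# Hypothetical readings: 10 time steps x 3 sensors; None marks a lost packet.
N_SAMPLES, N_SENSORS = 10, 3
a = [[None] * N_SENSORS for _ in range(N_SAMPLES)]

# Sensors 0 and 1 report every sample; sensor 2 drops samples 4 and 7.
for t in range(N_SAMPLES):
    a[t][0] = 20 + t
    a[t][1] = 30 + t
    if t not in (4, 7):
        a[t][2] = 40 + t


def sensor_mean(array, sensor):
    """Average one sensor's column, skipping the missing-data placeholders."""
    reported = [row[sensor] for row in array if row[sensor] is not None]
    return sum(reported) / len(reported)


print(sensor_mean(a, 0))
print(sensor_mean(a, 2))
```

Every calculation on the array needs the same "is not None" filter, which is the bookkeeping I wanted to avoid.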

Nested lists to the rescue! In Python, a list is a mutable sequence of objects that you can append to. So, to create an empty list:

data = []

Now we have an empty list called "data." Next, let's append three lists inside the list.

data.append([])
data.append([])
data.append([])

We now have a list of empty lists. Kind of like a book with empty chapters in the table of contents. Every time we append a new list, we add a new chapter, but no pages are there yet. Data now looks like this:

data = [[],[],[]]

Now, let's say sensor number 0 reports a measurement of 3. Then sensor 1 reports an 8, etc...

data[0].append(3)
data[1].append(8)

Continuing, let's say that sensors 0 and 1 never miss an incoming measurement, but sensor 2 is spotty. In this case, data[0] and data[1] are longer than data[2], but the nested list has no problem with this. If you want the average of sensor 0's measurements, you could do something like:

from statistics import mean  # mean() is not a builtin; numpy.mean works too
mu = mean(data[0])

Easy! Way better than trying to keep track of missing data. The only downside is that dynamic data structures are slower than pre-allocated, fixed-length arrays. If you need your program to be screaming fast, appending to lists may not be your best choice.
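Putting the pieces together, here is a small self-contained sketch. The packet values and the (sensor, value) tuple format are assumptions for illustration, not my actual sensor data.

```python
from statistics import mean

N_SENSORS = 3
data = [[] for _ in range(N_SENSORS)]  # one empty list per sensor

# Hypothetical packets as (sensor, value) pairs, in arrival order.
# Sensor 2 is spotty, so fewer of its packets show up.
packets = [(0, 3), (1, 8), (2, 5), (0, 4), (1, 9), (0, 5), (1, 7), (2, 6)]

for sensor, value in packets:
    data[sensor].append(value)

# Each inner list holds only what was actually received.
for sensor, samples in enumerate(data):
    print(sensor, len(samples), mean(samples))
```

One thing to watch for: statistics.mean raises an error on an empty list, so check that a sensor has reported at least once before averaging it.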

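FOLLOW UP: If you want to create a list of empty lists, you can use the following convenient code:
emptyLists = list(map(list, [[]]*100))
This will create a list of 100 empty lists. The outer list() call is needed in Python 3, where map returns an iterator rather than a list.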
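A word of caution when building a list of empty lists: multiplying on its own, as in [[]] * 3, does not give you independent lists. The sketch below shows the difference, along with a list comprehension as another way to get separate lists. The names shared and separate are just for illustration.

```python
# Multiplying a list copies references, so every slot is the SAME inner list.
shared = [[]] * 3
shared[0].append(3)
print(shared)      # [[3], [3], [3]]

# A list comprehension builds a fresh inner list on each pass.
separate = [[] for _ in range(3)]
separate[0].append(3)
print(separate)    # [[3], [], []]
```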

Technology Commercialization: As Important as the Discovery

I often get asked why I am pursuing a PhD in electrical engineering. "Do you want to be a professor?" they usually ask. Frankly, the answer is no. Although I have one of the best advisors around, and I enjoy, appreciate, and understand the value of academic research, my aspirations are different. My dream is to come up with creative new ideas, then deliver them to everyday people in the form of products or services. I believe that our work as engineers is advanced only when we are taken for granted. Our users push the button, it works, and they usually don't know or care how.

We as engineers often view the commercialization of our work as something that should be left to others. We come up with the ideas, show that they work, then leave it to someone else to implement them in a product. We like to tell ourselves that "engineers are not technicians," and think we have more creative/interesting work to do. We usually believe that the most interesting and rewarding part of science is in the research, not the development.

As scientists and engineers, I believe we need to do a better job at understanding the importance of the implementation and development. I recently teamed up with a friend of mine who has a Master's in Business Administration (MBA) to write a business plan. We are developing a company to commercialize some of the technologies that I am working on as a PhD student, and a business plan is an important first step in making sure the company has a viable offering. As an engineer, I was tempted with thoughts like, "I should just let the business people take care of our plan. I'm an engineer, and I should focus on the technology itself, not the petty financial and marketing aspects."

To my surprise, I found that thinking and writing about the marketing of our proposed product amplified my technical abilities as an engineer. Suddenly, the problems with the current state of my technology were obvious, and problems that I thought were important to solve became worthless. New ideas and opportunities for the future of the technology poured into my thoughts, all of which could easily be the topic of research publications. I gained confidence that my research mattered to everyday people, and my motivation in my research work skyrocketed. I knew that solving the right problems would be extremely valuable.

If you're an engineer, don't blow off the applications and viability of your research or leave them to others. If you take some time to understand the business aspects of your research, you'll be much more successful as a scientist. You'll know what's important to solve, and what can be ignored. People will listen to you because they know that you see the big picture. Your job opportunities and skill sets will be amplified and expanded.

The University of Utah is a phenomenal school when it comes to commercialization of the new technologies developed by professors and students. Here are some links:

If you're at Utah, check these programs out, and watch for classes that relate. If you're not at Utah, find out what commercialization programs and classes your school has to offer. It's worth your time.

Utah Open Source Conference 2009

I am a big fan of open source software, as you can probably tell by reading my previous posts. I use Linux, OpenOffice, Inkscape, Drupal, and a ton of other open source applications on a regular basis. I love the fact that an entire community dispersed across the world can work on open source projects. Mark my words, 10 years from now, open source applications will be ubiquitous and preferred.

I'm excited for this year's Utah Open Source Conference coming up on October 8-10, 2009. It's a great time to meet other users of open source software and talk about how to make it better. Check it out, and if you live in Utah, sign up!

Follow-Up on Python vs. Matlab

A few weeks ago, I posted about my attempt to replace Matlab with Python as my preferred scientific/numerical/engineering/plotting software. This is simply a follow-up to share my thoughts after a few weeks of using Python exclusively.

I now prefer Python over Matlab, and will continue to use it for my engineering research, publications, and other work. Python is different from Matlab, so the change has slowed me down a bit while I learn some of the details, but I have really enjoyed the process. Python is elegant and powerful, and I have found that I can do almost everything that I used to do in Matlab. There are even some things I can do in Python that I can't do in Matlab.

My main complaint is that the matplotlib (Python plotting library) windows are not as interactive and flexible as Matlab's. The end-result plots are just as high-quality as Matlab's; it's just that the plot viewing tool is not as rich. This is a small price to pay, however, for such a powerful and friendly scientific computing tool. I'm sold.

I encourage all of you engineers out there to give Python, Scipy, and matplotlib a fair try. It's worth your time, and you won't regret it.

Python As a Replacement for Matlab

Python is an excellent, free alternative to Matlab for scientific and numerical computing. Matlab is certainly the de facto standard when it comes to numerical computing, but in most cases, Python is equally capable. In many instances, Python is better. I've been playing with it over the past few days, and it is very possible that I will dump Matlab completely in favor of Python. Time will tell.

Here's a description of Python from their website:
Python is a dynamic object-oriented programming language that can be used for many kinds of software development. It offers strong support for integration with other languages and tools, comes with extensive standard libraries, and can be learned in a few days. Many Python programmers report substantial productivity gains and feel the language encourages the development of higher quality, more maintainable code.

Python by itself is not enough to replace Matlab, and that's where SciPy and Matplotlib come into play. SciPy is a Python library that has a ton of functions for scientific computing. It builds on NumPy, which is a numerical computing library. Matplotlib is another package that allows you to plot things, just like in Matlab. It can do histograms, images, scatter plots, contours, and many more. It has the ability to save to PDF, EPS, PNG, PS, and even SVG if you want to edit your figures later with vector-based illustration software.
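As a rough sketch of what that looks like in practice, the snippet below draws a histogram and saves it in a few of those formats. The data is made up, save_histogram is a hypothetical helper name, and the "Agg" backend is an assumption so the script runs without opening a plot window.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: render to files only, no display needed
import matplotlib.pyplot as plt


def save_histogram(samples, out_dir, formats=("png", "pdf", "svg")):
    """Plot a histogram of samples and save one file per format."""
    fig, ax = plt.subplots()
    ax.hist(samples, bins=5)
    ax.set_xlabel("measurement")
    ax.set_ylabel("count")
    paths = []
    for ext in formats:
        path = os.path.join(out_dir, "histogram." + ext)
        fig.savefig(path)  # the format is inferred from the file extension
        paths.append(path)
    plt.close(fig)
    return paths


# Made-up measurements, just to have something to plot.
samples = [3, 8, 5, 4, 9, 5, 7, 6, 5, 4]
paths = save_histogram(samples, tempfile.mkdtemp())
print(paths)
```

The SVG and PDF outputs are vector graphics, so they can be opened and tweaked later in a tool like Inkscape.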

These figures were all created with Python and Matplotlib. For more, check out the gallery.

[Figures: three matplotlib example plots]

Give it a try! It may take you a few days to get up to speed, but in my opinion, it's very worth the time. If you end up using these tools, I encourage you to donate to Python, SciPy, and Matplotlib so the people behind this great software can continue their work.

Ubuntu 9.04 Coming Soon

Get BibTeX or Other Citations with Google Scholar

Google has a special page for searching for academic publications called Scholar. This is a really handy tool for finding papers on any given subject, and it searches across multiple journals, conferences, and disciplines.

One great new feature that I just found out about is the ability for Google Scholar to generate BibTeX and other common bibliography entries. To enable this feature, do the following.

  1. Go to www.google.com/scholar
  2. Click on "Scholar Preferences"
  3. Scroll down to the bottom and look for "Bibliography Manager"
  4. Enable the citation links, and choose which format you would like to use

Now, when you find a paper that you would like to cite, you can just click on the link right from the Google search page and get the citation information. Great!

Token Passing Protocol Demo

Here at the SPAN lab, we often need to gather information from a wireless network with a large number of nodes. Synchronizing a wireless network is a very difficult problem, so we developed a protocol that allows each node in the network to have a turn at broadcasting. This prevents multiple nodes from transmitting at the same time, which would interfere with every node's ability to deliver its data. Our token passing protocol is robust to outside interference, and can be extended for self-healing and self-forming networks.
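The basic idea of taking turns can be sketched in a few lines. This is only a toy model of token passing in general, not our actual protocol: the fixed circular order, the function name run_token_passing, and the node names are all assumptions, and interference handling and self-healing are left out.

```python
def run_token_passing(node_ids, rounds):
    """Return the broadcast order as (time_slot, node_id) pairs.

    Only the token holder transmits in a slot; it then hands the token
    to the next node in a fixed circular order (an assumed policy).
    """
    schedule = []
    holder = 0  # index of the node that starts with the token
    for slot in range(rounds * len(node_ids)):
        schedule.append((slot, node_ids[holder]))
        holder = (holder + 1) % len(node_ids)  # pass the token along
    return schedule


schedule = run_token_passing(["A", "B", "C"], rounds=2)
print(schedule)
# [(0, 'A'), (1, 'B'), (2, 'C'), (3, 'A'), (4, 'B'), (5, 'C')]
```

Because each time slot has exactly one token holder, no two nodes ever transmit at once in this model.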

Here's a video of me explaining the protocol.

Even Engineers Can Dance

Neal and I conducted an experiment last week where we needed to know exactly where we were at a given time. The challenge was to keep both of us moving on a particular path, walking at the exact same speed. As we thought about how to do this, we had the idea that if we turned on some music and took pre-defined steps to the beat, we could keep ourselves in sync. Check it out:

Our next experiment will feature costumes and background singers/dancers.
