Congratulations to my Excella colleague (and fellow JMU alum) Mike McGarr on the publication of his Java article in Linux User. Great job Mike!
Article: Java Lives
I’ve been using jQuery Mobile since roughly a week after its initial alpha 1 release back in October. I really like what I see so far, and I wanted to share my initial impressions of both the alpha 1 and alpha 2 releases. Keep in mind that some of the gripes I list could stem from my own misreading of the documentation; the framework is so new that, beyond the official forum, there is not much information about jQuery Mobile on the web.
I tried jQuery Mobile because I was developing a web application targeting the Droid and iPhone browsers. Many of the jQuery UI widgets did not work properly in mobile browsers, and I was looking for an alternative to either coding my own widgets or building native Android and iOS applications.
It was easy to convert my custom mobile web application to the alpha 1 release, particularly because I was already using jQuery (although I did have to upgrade from the 1.4.3 release to 1.4.4). If you’re using YUI, Dojo, or ExtJS as your JavaScript framework, you will have to decide whether to go purely the jQuery route or run jQuery Mobile alongside your existing framework. For a mobile application the hybrid option is not advisable because of the overhead of loading two JavaScript libraries.
While converting my application to jQuery Mobile alpha 1, I noticed how polished the standard user interface is, with an appealing light blue applied to most of the buttons. The standard theme can also be changed easily by specifying attributes on HTML elements.
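As a sketch of how that attribute-based theming works (the swatch letters and page content here are illustrative, not from my application):

```html
<!-- data-role tells jQuery Mobile what widget to build; data-theme
     selects one of the framework's built-in color swatches. -->
<div data-role="page" data-theme="b">
  <div data-role="header"><h1>Settings</h1></div>
  <div data-role="content">
    <a href="#" data-role="button" data-theme="e">Save</a>
  </div>
</div>
```

Changing a single `data-theme` letter restyles the element without touching any CSS yourself.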
While several widgets are still missing, such as the date picker and time picker, the buttons, lists, slider, on/off switch, checkboxes, and text input fields gave me plenty to play with. It took about four hours, including learning time, to convert my roughly 20-page custom application to the alpha 1 release, although I had to change my HTML page structure. The alpha 2 release still lacks the date and time pickers, which is a deal breaker for many applications. If the January release contains these two widgets, it will help jQuery Mobile become a “production-worthy” JavaScript user interface framework.
In the alpha 1 release, it was very difficult to build a multi-page application out of separate HTML files. The paradigm was that your whole application should live in a single HTML file, which didn’t work for me since I was using Django as my web application framework. There were ways to split the HTML across files, but they felt like hacks. Alpha 2 fixed the problem by making it much easier to pull in separate pages. My application is now spread across a few dozen HTML files, and I’ve had no further issues with loading them.
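For anyone curious how the multi-file setup looks in alpha 2, a plain link is enough; the framework fetches the target page over AJAX and transitions to it (the file name and ids below are hypothetical):

```html
<!-- In index.html: jQuery Mobile intercepts this click, fetches
     details.html in the background, and slides to the page it finds. -->
<a href="details.html">View details</a>

<!-- In details.html: the framework pulls in the first element
     marked data-role="page". -->
<div data-role="page" id="details">
  <div data-role="content">Details go here.</div>
</div>
```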
One of my biggest gripes with both the alpha 1 and alpha 2 releases is that the “step” attribute on the range slider is ignored. I really need a slider that goes from 5 to 20 in 0.1 increments, but that’s not currently possible, so I’ve had to design the user interface around a text input instead. If the next release fixes the step increment issue, I will switch back to the slider.
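Until the framework honors “step” itself, one workaround is to snap the slider’s raw value to the desired increment in your own change handler. The helper below is my own sketch, not part of jQuery Mobile:

```javascript
// Snap a raw slider value to the nearest multiple of `step`,
// measured from `min`, and clamp it into [min, max].
function snapToStep(value, min, max, step) {
  var snapped = min + Math.round((value - min) / step) * step;
  snapped = Math.min(max, Math.max(min, snapped));
  // Trim floating-point residue like 7.300000000000001.
  return parseFloat(snapped.toFixed(10));
}
```

In the page you would call it from the slider’s change event, e.g. `$('#mySlider').val(snapToStep(raw, 5, 20, 0.1)).slider('refresh');` (selector name assumed).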
Another issue I’ve come across is dynamically loading elements onto a page. If you simply inject an HTML element, it will not be styled properly until you explicitly call a refresh-like function on the widget. The lack of styling was particularly apparent when I tried to build a dynamic unordered list. I ended up changing the way I pulled in the elements with Django so they arrive with the initial page load instead of via an AJAX call. The alpha 2 release addressed this issue, but since I kept my implementation the same, I do not yet know how well the new approach works.
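For anyone hitting the same unstyled-list problem, the general pattern (as I understand it; the selector and item markup are hypothetical) is to refresh the widget after appending:

```html
<ul data-role="listview" id="results"></ul>
<script>
  // Items appended after page enhancement render unstyled until the
  // listview widget is told to re-process its children.
  $('#results').append('<li><a href="#">New item</a></li>');
  $('#results').listview('refresh');
</script>
```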
Despite these gripes, overall I really like jQuery Mobile, and I’m looking forward to the next release, scheduled for January. If you’re already comfortable with jQuery, you will be productive with jQuery Mobile after only a few hours of learning. I will post further impressions as I continue working on my custom mobile browser application.
Quora keeps coming up in my big data searches as a place to find well-thought-out answers to interesting questions. This Quora post collects several answers on the best data blogs on the Internet. Many of the blogs are not specific to big data, but together they provide a comprehensive overview of the ways businesses are currently using data.
Article: What are the best blogs on data?
The Consistency, Availability, and Partition Tolerance (CAP) Theorem is critical for understanding how NoSQL solutions trade away SQL database features we take for granted in order to achieve greater performance or fault tolerance. This article is a great overview of the subject and a must-read for technical workers who want to cut through the hype around NoSQL solutions.
Article: CAP Theorem Overview
On the O'Reilly Strata website, Edd Dumbill describes three big data trends for 2011:
I agree with #1 and #3, but #2 seems shaky to me. Truly big data, at tera-, peta-, or exabyte scale, is simply too large to be analyzed in real time with today’s technologies; that is why batch processing systems are used. However, real-time analysis of subsets of big data will continue to be important in e-commerce systems.
Another trend I would add for 2011: mainstream companies with great technology employees figuring out how big data can drive better understanding of their customers. Google, Facebook, Twitter, and similar technology companies already use big data, but more typical large corporations are just starting to grasp the concepts. Look for CIOs to become more inquisitive about how data can drive the business and generate revenue growth.
A person who “can obtain, scrub, explore, model and interpret data, blending hacking, statistics and machine learning.”
Leading companies with massive data sets are already exploring, understanding, and using information locked away in their data stores to drive their businesses. Look for this trend to move into the mainstream over the next several years as companies outside of the Internet and financial sectors realize that their proprietary data combined with public data sets can provide a greater understanding of existing and potential customers.
Article: Wanted: Data Scientists
This is a great blog post from Netflix on the mind shift they had to undergo while moving from their traditional data centers to Amazon Web Services. I found this paragraph particularly interesting:
One of the first systems our engineers built in AWS is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.
That should be standard practice for any mission-critical system, but it’s the first time I have heard of a company deliberately breaking pieces of its architecture to verify that failures are handled properly.
Article: 5 Lessons Netflix Learned During Their Transition to Amazon Web Services
Andreessen Horowitz is investing $25 million in the open database service Factual. This comes after numerous other investments made in the big data space by this firm and others. Some firms, such as Cloudera, appear to have solid business models to build upon. What will be interesting is what new models come out of this space in the next several years.
LinkedIn has the world’s most up-to-date data set on people’s current and former jobs. That career information, along with the connections among colleagues, makes for a compelling project for the best data scientists to work on. Unsurprisingly, LinkedIn was able to recruit a top data scientist from Google to move the company’s initiatives ahead.
Johns Hopkins researchers are building a supercomputer optimized for peak Input/Output Operations Per Second (IOPS). This architecture is in contrast to the machines on the Top 500 supercomputer list, which are designed for peak floating point operations per second (FLOPS). The increasing importance of big data will continue to divide supercomputers into different camps because the hardware required for each design is very different. IOPS designs require more storage per computing node as well as much greater network bandwidth between nodes and racks. FLOPS designs require faster processors[1] (including graphics processing units for GPGPU approaches) and extensive cooling to remove the heat generated by running processing units at maximum utilization for long durations.
There will ultimately need to be two Top 500 lists that measure IOPS computers separate from FLOPS computers.
[1] Note that the “fastest” FLOPS supercomputer processors are optimized for processing power per watt. Supercomputers often use simplified processor designs underclocked to around 1 gigahertz to minimize waste heat, far below the 3.4+ gigahertz speeds available in desktop processors.
I spent several hours last night trying to figure out why the Google Web Toolkit (GWT) project I created with Spring Roo would not work with an existing MySQL database created by the Django object-relational mapper. It turns out there were two reasons: the id primary key columns Django created were INTs while the Roo-generated code expected BIGINTs (Java Longs), and the Roo-generated entities required a version field that the Django tables did not have.
Here’s how I solved the problem:
ALTER TABLE table_name CHANGE id id BIGINT NOT NULL AUTO_INCREMENT;
Luckily I did not have dozens of tables to add the version field to. I would not call this an optimal solution; it would be much easier if a Spring Roo-generated GWT project did not require a version field. However, I could not figure out whether there is a way to modify the project after it is created to remove the need for the version field.
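For reference, adding the version field looked roughly like the statement below; the column name and default are assumptions based on Hibernate’s default optimistic-locking mapping, so check your generated entity before running it:

```sql
-- Add the optimistic-locking column the Roo-generated entities expect;
-- "version" is Hibernate's default column name for the @Version field.
ALTER TABLE table_name ADD COLUMN version BIGINT NOT NULL DEFAULT 0;
```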
This also ties this particular Django project to MySQL, which isn’t great because I often use SQLite for development on my MacBook Air (it only has 2 GB of RAM). In theory you can change the primary key in the GWT project to use an Integer instead of a Long, but I was unable to get that working, so I went with the ALTER TABLE solution instead.