This linked blog post is written by a developer who was working on the MongoDB NoSQL data store then transitioned to working with MongoDB for another project.
NoSQL data stores are not the solution to every data storage problem! Using an SQL solution for structured data is vastly better than trying to shoehorn it into a NoSQL data store.
In this video, Lorenzo Alberton discusses the “what, why, and when” of using NoSQL solutions.
Video: NoSQL Databases: What, Why, and When (ontwik)
Two of my favorite topics, NoSQL and mobile apps, are covered in this awesome tutorial by Todd Anderson. The multi-part tutorial is linked from the Couchbase blog post.
GigaOm is reviewing topics from last week’s Structure Big Data conference. One subject is that horizontal markets for big data analysis and visualization are quickly filling up with offerings. Those offerings may not be the right answers to the big data challenge but there is certainly a lot of competition in the space. The author of this post, “Why Big Data Startups Should Take a Narrow View,” advocates looking at vertical markets.
I agree with looking into vertical markets for a specific fundamental reason. Data exploration, analysis, and visualization generate value only when combined with domain-specific knowledge. Data visualizations in isolation may look interesting but value is only created when the information is acted upon. It is much easier to act on specific information created by big data from your own industry than across industries that are not pertinent to your line of business.
Disruptive innovation is occurring in the data storage industry. Some trends are evolutionary and help established businesses such as Oracle, for example the predictable increase in data generation and the corresponding rise in storage needs. Other trends are revolutionary and disruptive, such as the increasing importance of semi-structured and unstructured data sources.
Oracle is well positioned to take advantage of structured enterprise data growth needs, as evidenced by its positive latest earnings report. Oracle will remain entrenched in the structured data storage market despite competition from open source offerings. Most enterprises believe their structured data is too critical to be stored in open source solutions such as MySQL and PostgreSQL.
However, as firms such as 10gen (MongoDB) and Couchbase continue to improve their NoSQL offerings, Oracle will find its profit margins decrease due to competition in semi-structured and unstructured data storage. The NoSQL solutions present disruptive innovation through revolutionary change in data storage methods. NoSQL is a different paradigm in data storage: fundamental computer science principles such as the CAP Theorem force design constraints that cannot be reconciled with traditional relational databases.
Structured SQL products will coexist with semi-structured and unstructured NoSQL products in the near term. Yet there is overlap in data storage needs that will eat into Oracle’s traditional relational database business. For example, storing, searching, and extracting unstructured CLOB and BLOB values in databases is a frustrating experience for developers and database administrators. NoSQL products make unstructured data storage much easier due to trade-offs in the CAP Theorem.
SQL databases also scale differently with big data sets which forces mutually exclusive design decisions. There are two ways to scale: vertically and horizontally.
SQL solutions are geared towards vertical scaling while NoSQL solutions are usually built for horizontal scaling [1]. Oracle prefers vertical scaling because customers are more likely to buy expensive Sun Microsystems servers (which Oracle now owns). Companies without the funds for big servers or the inclination to buy dedicated hardware often prefer the horizontal scaling route because it’s easier to match supply of your web services with demand by customers through service providers such as Amazon Web Services.
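To make the horizontal route concrete, here is a minimal sketch of hash-based partitioning, one common way a data store can spread records across several commodity servers. The function names (`shard_for`, `distribute`) and the `user:N` keys are hypothetical and chosen only for illustration; they are not taken from any particular NoSQL product.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Map a record key to one of num_nodes servers using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def distribute(keys, num_nodes):
    """Group keys by the node that would store them."""
    nodes = {n: [] for n in range(num_nodes)}
    for key in keys:
        nodes[shard_for(key, num_nodes)].append(key)
    return nodes


# Hypothetical workload: 1,000 user records spread over 3 nodes, then 4.
keys = [f"user:{i}" for i in range(1000)]
three_nodes = distribute(keys, 3)
four_nodes = distribute(keys, 4)

# Adding a node lowers the number of records each node has to hold.
print({n: len(v) for n, v in three_nodes.items()})
print({n: len(v) for n, v in four_nodes.items()})
```

Adding capacity here means adding another node rather than buying a bigger server. Note that plain modulo placement moves many keys when the node count changes, which is why production systems tend to use consistent hashing or a similar scheme instead.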
Semi-structured and unstructured solutions will eventually compete in the structured data storage market. For example, EMC is not a traditional SQL database solution provider, but it is making major bets on big data and will soon be in direct competition with Oracle.
Oracle will compete against semi-structured and unstructured products with reduced motivation because it will cut into their server business. Oracle will be caught in a trap that may force it to move to the higher margin enterprise corner of the market where only structured data products exist.
Oracle could just buy out all of the NoSQL competitor companies. Disruptive innovation by commercial firms could be marginalized by Oracle’s very large war chest. Only open source NoSQL products would remain. Those would serve the needs of the majority of companies, but enterprise customers could still be beholden to Oracle’s service guarantees, similar to the situation today. However, if a competitor refuses to be bought out or antitrust issues are raised by limited competition in the data storage industry, Oracle could be left with a much smaller piece of the storage market than it holds today.
[1] I’m generalizing here so if you’re a subject matter expert you may take issue with this description. I recognize that and I appreciate feedback since I’m sure there’s a better way to succinctly state the difference in scaling between SQL and NoSQL solutions.
Cloudera has an interesting white paper entitled “Ten Common Hadoopable Problems.” The paper’s introduction has background information familiar to experienced Hadoop users. After the introduction, the paper presents problems and solutions for areas such as risk modeling, recommendations, and ad targeting.
The problem descriptions and solutions are high level, so don’t expect any MapReduce algorithms. The most useful part of the paper is that it is targeted primarily at business decision makers. The paper is particularly useful when promoting Hadoop in an organization as part of the solution to any of these common problems. One of the common problems I often face when discussing Hadoop with non-technical users is answering the question, “do you have any case studies on how it can solve my problems?” Apparently Cloudera has encountered the same question and now they’ve gone out of their way to make our lives easier by passing on these case study answers.
White paper: Ten Common Hadoopable Problems (Cloudera)
DataStax’s new product Brisk is a merger of Cassandra, Hadoop, and Hive. While Apache HBase is the most common NoSQL data store used with Hadoop, Brisk attempts to merge Cassandra’s distributed data stores with Hadoop’s MapReduce capabilities (and Hive’s SQL-like query tools). The general concept is that Hadoop can be run against one of Cassandra’s multiple distributed data stores without impacting the performance of the other data stores.
Brisk is an interesting concept but the proof is in whether the product delivers the best of both Cassandra and Hadoop without transferring the weaknesses. Since Cassandra is eventually consistent (trading consistency for availability and partition tolerance in the CAP Theorem), how that impacts Hadoop’s MapReduce jobs remains an open question.
I’ve requested a copy of Brisk through DataStax’s website and when I get my hands on it I’ll create a further technical write-up.
Link: DataStax Brisk
This amazing 100-second video of world events during human history was created by scraping all geo-tagged articles from Wikipedia. The visualization is obviously heavily based on Western world history since most English Wikipedia articles are written by contributors in the United States and Western Europe.
The video is worth watching for the final outline of the continents based on events alone.
I wrote the following piece four years ago while in my first job at Freddie Mac. The piece is advice I would give graduating college students in their first technology jobs.
Enthusiasm
Get excited about everything you do. Every job is an opportunity to become better. A manual maintenance task on a legacy system is a chance to help your colleagues by creating an automated fix. Simple programming assignments are an opportunity to learn how to rigorously unit test your solutions. If you work that hard on simple tasks your colleagues will want to work with you on harder challenges.
Enthusiasm is contagious but it is rarely found in the business world. Many of your coworkers are jaded by bad experiences and poor career choices. If you are seen by your colleagues as someone who takes pleasure in working hard and enjoys every day despite its numerous challenges, people will be drawn to you.
You have the ability to make the workplace enjoyable by being enthusiastic. One reason businesses hire students out of college is that recent graduates are uplifting. New graduates are untainted by years of working in average organizations. Keep a positive attitude and take on new challenges. Negativity and sarcasm are poor alternatives.
Optimism and enthusiasm go hand-in-hand. Being optimistic during periods when projects are running smoothly is easy. But optimism is more important during challenging times. Software engineering is a difficult field! There are a myriad of reasons why projects can fail. Ambiguous requirements, poor project plans, incorrect architectures, faltering project sponsorship, and team member attrition are a small subset of the major issues that can occur.
Yet technology projects do succeed. Success is controlled by the optimism of a project team. Success must always be viewed as an option during the most difficult times. Be a source of optimism and your colleagues will remember you as a valuable contributor. People want to be around others who are positive. If you exude a positive vibe, your colleagues will be eager to work with you again in the future.
As a recent college graduate you will not have the same depth of technical expertise nor the breadth of project experience as your colleagues. Enthusiasm makes up for the knowledge you lack today and is just as valuable to the success of your projects.
Here is a summary with links to my MS MIT retrospective posts:
This is the final part of my retrospective on the UVA Master’s in Management of Information Technology program I graduated from in May 2010. See part 1, part 2, part 3, part 4, and part 5 for context.
Graduation and Beyond
Graduation at UVA is a beautiful tradition. Go to it! It provides closure to the MS MIT experience.
After wrapping up Mod 4 and graduating it takes time to transition back into a normal work/life balance. Here are some tips and lessons to ease the transition:
Program Value
The MS MIT program provides most of its value by teaching business school topics in the context of technology. If you are in a technology field and are interested in MBA programs but do not want to study unrelated topics, the MS MIT program is a great choice.
The class is composed of a range of experience levels. My classmates had between 3 and 25+ years of experience and averaged 14 years of experience. There was roughly a 50/50 split between commercial and government positions (either consulting or employees).
A big portion of value in the program is gained by learning from your classmates. The program, particularly the Charlottesville section, facilitates interaction between classmates to assist the learning process outside of the classroom.
My One Criticism of the Curriculum
Overall I had a great experience with the MS MIT program. However, I was disappointed in one subject area. Technology and entrepreneurship is not covered in the curriculum. We learned a disproportionate amount about issues in large organizations. Many of the challenges we discussed in class were symptoms of big bureaucratic organizations.
For example, one of our case studies was on implementing a CRM system at a Fortune 500 organization. The political challenges were more of an issue than the technology problems. Smaller companies would not face the same challenges because they would be more likely to implement a standard software-as-a-service solution.
However, most of the subject matter in the program could apply to established (not startup) organizations. If you are looking to learn more about applying technology to create startups, there are other Master’s programs out there with more of an emphasis on that subject.
Conclusion
That’s my retrospective on the MS MIT program. In hindsight it was a great boost to my career and such a pleasure to have been a part of despite the heavy workload.
I hope this retrospective is helpful for prospective and current MS MIT students. If you have further questions or feedback, please email me at [email protected].
This is part 5 of my retrospective on the UVA Master’s in Management of Information Technology program I graduated from in May 2010. See part 1, part 2, part 3, and part 4 for context.
Mod 4
Mod 4 is by far the most difficult, intense, and valuable experience in the MS MIT program. Prepare for emotional ups and downs. One day you will have everything lined up with your company and course work, then the next day your group has to scramble because a project deliverable was not acceptable.
Besides the Mod 4 Capstone Project where you work with a company, you will study corporate strategy, managerial accounting and finance, marketing, and behavioral event interviewing. After completing Mod 4, you will be able to hold your own against any top MBA student in corporate strategy discussions. We learned Michael Porter’s Five Forces and similar corporate strategy subject material. We read and discussed Harvard Business Review case studies for context. All of the material was grounded in technology subject matter. I felt the focused subject matter approach worked very well because we analyzed case studies that we had experience dealing with.
Accounting and finance was also beneficial. I never studied accounting or finance before the MS MIT program. Learning finance was difficult but I now have working knowledge of calculating discounted cash flows, net present value, weighted average cost of capital, and related concepts. Reading balance sheets, income statements, and statements of cash flows is important when working with publicly traded companies because you can fully understand and appreciate the business challenges they face.
I found the Behavioral Event Interviewing (BEI) class very beneficial. Every time I go into an interview and get asked BEI questions, I know how to structure my answers so they are appropriate. Studying BEI boosted my interview confidence because I know more about what the interviewer is looking for.
It’s easy to get lost in the Mod 4 class work because there is so much of it. But the capstone project is ultimately the biggest part of the grade. There are two major presentations in addition to written reports:
After the final presentation, the group returns on the last day to defend its work to the professors. Prepare to answer questions that time did not allow for after the presentation the previous day.
Advice for Mod 4:
My advice for adjusting to life after graduation, analysis of the MS MIT program’s value and conclusions are found in part 6.
This is part 4 of my retrospective on the UVA Master’s in Management of Information Technology program I graduated from in May 2010. See part 1, part 2, and part 3 for context.
Mod 2
Mod 2 covers managing information technology projects. A lot of the material is related to PMBoK (Project Management Body of Knowledge). Some people in the class earn their Project Management Professional (PMP) certification by taking the PMP test after finishing Mod 2. I cannot speak to how easy or difficult Mod 2 makes the PMBoK material since I did not take the test.
By far the most interesting part of Mod 2 is the group project. Groups are assigned by classmate geography to facilitate interaction. For example, I lived in Charlottesville at the time and my other four group members lived within half an hour of me.
Groups are responsible for finding a completed (or failed) IT project and performing a retrospective. There is a laundry list of things that can go wrong on IT projects and groups have to analyze what went right, what went wrong, and what future projects can learn to do better.
Each group interviews project stakeholders, analyzes documentation and deliverables, and views demos of the system if it was completed. The analysis objective is to piece together the outcome and the project’s significant intermediate events. At the end of the three months, each group presents a retrospective on the project.
Things I wish I knew before going into Mod 2:
Mod 3
The main topics for Mod 3 are enterprise integration, data warehousing, and business intelligence. Although I found Mod 2 interesting, Mod 3 was where the classwork really became fascinating because it focused on enterprise-wide issues.
The Mod 3 topics are all major challenges that frustrate even the best IT organizations. Class discussions on enterprise integration and business intelligence were interesting because many of my classmates were working on these large projects. Professors provided best practices and case studies while classmates provided concrete examples.
The group project for Mod 3 comes from a list of choices on relevant issues in IT organizations, such as social networks, cloud computing, and the “data deluge” (how to process and make sense of the exponentially increasing amount of data organizations produce). As a side note, the data deluge and big data are the topics this blog usually focuses on so if you have further interest in that area, please check my archive for relevant posts.
Advice for Mod 3:
Mod 4 is introduced even before Mod 3 ends. Mentally prepare yourself for the most difficult part of the program.
My Mod 4 retrospective can be found in part 5.
This is part 3 of my retrospective on the UVA Master’s in Management of Information Technology program I graduated from in May 2010. See part 1 and part 2 for context.
Mod 0
May 2009 was the first weekend of my cohort’s program. It was more than just a meet and greet with classmates. Mod 0 set the tone for program weekends with three 8-hour class days.
Topics included an introduction to corporate strategy, IT relevance, and the program’s tag line, “Delivering business value through IT.” We learned that the program is about how IT works when done well. IT departments can be a critical piece of corporate strategy and not just a cost center.
If you leave Mod 0 feeling like you did not get any value out of the topics and discussions then you should consider dropping the program. It is a major commitment not only to yourself but also to your classmates who are paying $40,000 for an education. The cohort is only as strong as its weakest link. Everyone has to contribute for the program to produce maximum value.
Mod 1
Mod 1 is 10 days in June of rigorous class and project work. You get very little sleep. I slept 4 hours a night (2:30am to 6:30am) every day during Mod 1 and caught up as much as possible on the weekend break that divides the two 5-day sections. Coffee was crucial.
Topics in Mod 1 included enterprise architecture, computer network fundamentals, computer and network security, database modeling, and an introduction to data warehousing (covered in further detail in Mod 3). The idea behind Mod 1 is to give non-technical students a grasp of technical fundamentals so the concepts are no longer intimidating. The topics are high-level. You will not be doing any Java or .NET programming.
If you are technical or have a technical background, the subject matter in Mod 1 is straightforward. You can fill in gaps in your knowledge or refresh areas you have forgotten. For example, I knew HTTP servers ran on port 80 but I forgot that a browser running on a client machine opens high-numbered ports for communication with an HTTP server. I should have known that concept but had not thought about it in a while, so class corrected it for me. There were dozens of similar examples scattered throughout our coursework.
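The port behavior in that example is easy to see for yourself. The sketch below is only an illustration, assuming a local listener standing in for an HTTP server; it binds to an OS-chosen port rather than port 80 because binding to 80 usually requires elevated privileges. The function name `show_client_port` is made up for this example.

```python
import socket


def show_client_port():
    """Connect a client to a local listener and return both port numbers."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Port 0 asks the OS for any free port; a real HTTP server would use 80.
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        server_port = server.getsockname()[1]

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            # The client never picks its own port; the OS assigns one on connect.
            client.connect(("127.0.0.1", server_port))
            client_port = client.getsockname()[1]

            # The server sees the client's OS-assigned port as the peer port.
            conn, peer = server.accept()
            peer_port = peer[1]
            conn.close()

    return server_port, client_port, peer_port


server_port, client_port, peer_port = show_client_port()
print(f"server listens on {server_port}, client connected from {client_port}")
```

The client side of the connection gets a high-numbered port picked by the operating system, which is the same thing a browser does when it talks to a web server on port 80.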
A few bits of advice for students about to experience Mod 1:
Mod 1 is a fantastic and intense experience. Go into it with the mindset of working incredibly hard the entire time and you can recover when the two weeks are over.
See part 4 for Mods 2 and 3.
This post is part 2 of my retrospective on the UVA Master’s in Management of Information Technology (MS MIT) program I graduated from in May 2010. Please read part 1 for context.
Here’s an overview of the 12-month program that takes place in Charlottesville, Virginia on UVA Grounds:
The curriculum changes each year to incorporate material on new technology and trends. The professors did a great job of keeping the material fresh and relevant to the latest news. Classes were engaging and a lot of fun despite the 8 hour length.
Part 3 covers my experience with Mods 0 and 1.
I graduated from the University of Virginia (UVA) Master’s in Management of Information Technology (MS MIT) program in May 2010. I worked full-time at Booz Allen Hamilton while going through the full-time 12-month program in Charlottesville, Virginia.
It’s been over 10 months since my group finished our program requirements by defending our capstone project to our professors.
A theme throughout the MS MIT program was performing retrospectives for iterative improvement. This six-part series of blog posts is my retrospective for the entire Master’s program.
Here is a summary of what my posts explain:
Part 2 is an overview of the MS MIT curriculum
I just had to link to this post by Rich Aberman at WePay. The post covers both marketing and corporate strategy from the perspective of a startup going up against an established large organization. It’s a fantastic read.
Post: Picking a Fight With An 800 Pound Gorilla (WePay blog)
O'Reilly has some more great information on Google Public Data Explorer. The post shows an example of visualizing public data on the U.S. unemployment rate for the last 20 years. The post also has an embedded presentation with more information on how to upload data to Public Data Explorer.
Article: Google Public Data Explorer (O'Reilly)