This Place is Taken

Saturday, May 23, 2015

Don't buy the Yureka!

 

OK, this is not a public service announcement. I had the worst smartphone experience using the Yureka, YU India's first phone offering. I bought it last week in Amazon India's revised sale, and I had them pick it up for return today.

So here is the rundown: it is a really powerful device, if what you want is an electric cooktop or toaster. The battery is completely useless. With Wi-Fi off, it runs down in under 5 hours, and with Wi-Fi turned on, it ran down completely in under 2 hours. And no matter how long you recharge the battery, it never reaches 100% charge. It is proof that 100% energy conversion is not possible, as per the laws of thermodynamics.

But it's not just the useless battery that let me down. One huge limitation is that the CyanogenOS running on the phone has no support for OTG devices or even plugged-in SD cards. The on-board file manager app can read the SD card, but the MP3 player and video player apps on the phone DO NOT recognize it. So if you want to play your favourite tunes or videos, you have to go via the file manager and start playing them from there.

Anyway, I am in a way relieved that I got rid of that thing, and that Amazon will issue me a 100% refund of the payment. Those guys at YU have a lot of work to do before they can, or should, release their next phone.

For me, it's back to my trusted Nokia.

Friday, May 22, 2015

Xerox scanners/photocopiers randomly alter numbers in scanned documents

 

 

Scanners/copiers of the Xerox WorkCentre line randomly alter written numbers in pages that are scanned. This is not an OCR problem (we switched off OCR on purpose); it is a lot worse: patches of the pixel data are randomly replaced in a very subtle and dangerous way. The scanned images look correct at first glance, even though numbers may actually be incorrect. Without anyone noticing, this may cause scenarios like:

  1. Incorrect invoices

  2. Construction plans with incorrect numbers (as will be shown later in the article) even though they look right

  3. Other incorrect construction plans, for example for bridges (danger to life may be the result!)

  4. Incorrect dosing of medicine, which is even worse, I think.

The errors are caused by an eight (!) year old bug in the widely used WorkCentre and ColorQube scan copier families from the manufacturer Xerox. According to reseller data, hundreds of thousands of those machines are used across the planet. As a result, anyone who has used machines of the named families has to ask themselves:

  • How many incorrect documents (even though they look correct!) did I produce during the last years by scanning with Xerox machines? Did I even give them to others?

  • What dangers are posed by such possible document errors? Is there a danger to life for someone?

  • Can I be sued for such errors?

 


Thursday, May 21, 2015

Java is 20

The Java programming language turns 20 years old this year.

Java at 20: The JVM, Java's other big legacy

May 23 marks 20 years since the first version of Java was released for public use. The timing of its arrival coincided with the advent of the web and the new role technology took in improving business productivity, streamlining business processes, and creating new ways for businesses and customers to interact.

The importance of a given programming language, especially one as pervasive as Java, in changing how people use technology is difficult to overestimate. The big data revolution, for example, is primarily a Java phenomenon.

In industry and business, most of server-side computing is done using Java applications. And much of the Internet of Things is also emerging on Java devices.

But 20 years ago, the language was delivered to an entirely different set of needs: a good, general-purpose language for desktop computing.

Java arrived at an important moment in software development. Up until then, the primary programming languages were few and well-established: Fortran in scientific computing, COBOL in business, and C or the emerging C++ everywhere else in commercial programming.

While less popular languages filled specific niches—Ada (defense), Pascal (hobbyists and consultants to SMBs), Smalltalk and Lisp (academia), Perl (system administrators), and so on—the Big Three dominated computing.

Fatigue with C

However, a fatigue with C was definitely emerging. The language had two major handicaps in those days. First, it was too low level; that is, it required too many instructions to perform even simple tasks. Second, it wasn’t portable, meaning that code written in C for the PC could not easily be made to run on minicomputers and mainframes.

The low-level aspects, which still are apparent today, led developers to feel that writing applications in C was akin to mowing the lawn with a pair of scissors. As a result, large software projects were tedious and truly grueling.

The portability of C was also a major problem. Although by 1995 many vendors had adopted the 1989 ANSI standard, they all added unique extensions that made porting code to a new platform almost impossible.

It’s no coincidence, then, that this era saw the emergence of a new generation of languages. In 1995 alone, there appeared Ruby, PHP, Java, and JavaScript.

Java almost immediately became popular for mainstream programming due to its portability and large set of built-in libraries. The then-mantra for Java was “write once, run anywhere.” While not strictly true initially, it quickly became so, making Java a good choice for business applications that needed to run on several platforms.

IBM’s subsequent embrace of Java (especially via Project San Francisco) clinched the new language’s central place in business programming.

Once a language becomes mainstream, it tends to have a long lifetime, as will be demonstrated this year when the languages born in 1995 all begin celebrating their twentieth anniversaries. What makes Java stand out, though, is how much the language and platform have evolved in that time span.

Most conspicuous, to me at least, is the change in the Java Virtual Machine (JVM). While it delivered portability almost from the start, it did not initially deliver speed. Java was known for being slow to start and slow to run.

Continual Improvements

Today, Java is among the fastest languages and can scale to programs that process vast resources, as the big data revolution, a mostly Java-based phenomenon, has amply demonstrated.

The language, too, has seen extensive revision. From a start in which there were rough corners here and there, Java has evolved into a tool that can address almost every kind of programming problem.

The advent of Java 8 in particular added important features taken from functional programming idioms that make code shorter, more reliable, and more expressive.

The details of Java’s history are so well known that it’s easy to forget how rare that history is. Few languages have benefited from constant, large-scale engineering investment for two decades. Among major languages today, only Microsoft’s C# (and the .NET runtime) has been favored in this same way.

At one time, it was hoped that large communities of developers would be capable of driving this change by themselves. And certainly, the rapid pace at which early development tools advanced gave all programmers reason to believe. But those early tools turned out to be outliers, rather than heralds of coming things.

So, while others might celebrate 20 years of Java as if language endurance were in itself a major accomplishment, I prefer to celebrate the sustained rate of innovation and the 20 years of continuous investment required to make that happen.


 

 

Oracle's Version

 

Other media:

 

Java has turned 20

The technology community is celebrating 20 years of the Java programming language, heralding its use by some nine million developers and the fact that it runs on seven billion devices worldwide.

The language was launched in 1995 by Sun Microsystems, and is now run as part of Oracle after the firm acquired Sun in 2010.

Georges Saab, vice president of development for the Java Platform Group at Oracle, explained that the Java programme has been one of the most important of the past two decades.

“Java has grown and evolved to become one of the most important and dependable technologies in our industry today,” he said.

“Those who have chosen Java have been rewarded many times over with increases in performance, scalability, reliability, compatibility and functionality.”

As part of the celebrations, Oracle has released a detailed timeline of the history of Java, starting as far back as 1991 and the background to its inception when it was called Oak.

Other technology giants that use Java, such as IBM and Fujitsu, have lined up to sing the praises of the platform, and executives from both firms noted its impact over the past 20 years and looked ahead to its future.

"IBM is celebrating Java's 20th anniversary as one of the most important industry-led programming platforms spanning mobile, client and enterprise software platforms,” said Harish Grama, vice president of middleware products at IBM Systems.

“IBM looks forward to the next 20 years of growth and innovation in the Java ecosystem, including mobile, cloud, analytics and the Internet of Things."  

Yasushi Fujii, vice president of Fujitsu's Application Management Middleware Division, said: “Fujitsu recognised the utility of Java in IT systems as soon as it first became available, and even now we are working to promote its applications.

"We expect that Java’s continuing evolution will lead to further ICT development and a changing society, and look forward to working with the Java community to develop Java technologies."

One company that is perhaps not going to join in the celebrations is Google, which is in the middle of a long-running $1bn patent battle with Oracle over the use of Java in the Android operating system.

Oracle has also faced criticism for its management of Java, specifically that it releases security updates for the software only every quarter, often leading to huge patch releases that can cause headaches for IT admins.

Nevertheless, Oracle said that its stewardship of Java since acquiring Sun has seen two major platform releases, Java 7 and Java 8, as well as the next release, Java 9, slated for 2016.

Java 9 is set to include a new feature called Project Jigsaw which aims to "modularise the platform" to make it scalable to a wider range of devices and easier for developers to build larger applications on the platform.

As part of the celebrations, Oracle is offering a 20 percent discount on all Java certification exams until 31 December.

 

 

 


Java At 20: The JVM, Java's Other Big Legacy

Think of Java, which celebrates its 20th anniversary this week, and your first thoughts most likely go to the language itself. But underneath the language is a piece of technology that has a legacy at least as important and powerful as Java itself: the Java virtual machine, or JVM.


Because the JVM wasn't designed to run any particular language -- Java is only one of many possibilities -- it's become somewhat of a platform unto itself. Languages have been developed for the JVM that owe little or nothing to Java, and the future development of the JVM is turning more to empower the creation of new items that can leverage Java's existing culture of libraries and software or depart from it entirely.

The engines under the JVM hood

When people talk about the JVM, they're generally referring to a specific JVM: the one originally produced by Sun Microsystems and now owned by Oracle, which uses the HotSpot engine for just-in-time compilation and performance acceleration. With proper warmup time for long-running applications, code operating on HotSpot can sometimes meet or beat the performance of code written in C/C++.

Nothing says the HotSpot-empowered JVM has to be the one and only implementation of Java, but its performance and many years of development have made it the de facto choice for good reason. A galaxy of other JVMs have come (and gone), but HotSpot itself -- now an open source project -- remains the most common option for enterprise production use.

Here and there, though, others are attempting to become the keepers of their own JVM flame: One programmer, for instance, is developing a JVM entirely in Google's Go language -- although right now more as an experiment than as a serious way to give HotSpot any kind of competition.

Because of all the advanced optimization work put into HotSpot, the JVM has over time become a target platform by itself for other languages. Some are entirely new creations designed to exploit the JVM's high speed and cross-platform deployment; others are ports of existing languages. Plus, using the JVM means devoting less work to creating a runtime for a language from scratch.

The big JVM stars: Clojure, Scala, and Groovy

Of the languages created anew on the JVM, one stands out for being as unlike Java as possible: Clojure, a functional language designed (in the words of its creator, Rich Hickey) to be a "kind of a Lisp for the JVM," one "useful anywhere Java is." Or even where Java isn't: Puppet Server, for example, recently swapped out Ruby for Clojure as its core language, citing performance as one reason for the switch.

Aside from its power as a functional language, Clojure illustrates one of the fringe benefits of creating a language for the JVM: access to all of the resources provided by Java itself, typically libraries like Swing or JavaFX. To that end, developers more comfortable with Clojure can write programs sporting platform-native UIs, by way of what Java already offers -- but without having to write Java code directly.

Scala, another functional language for the JVM, hews more closely to Java in terms of syntax, but it was created in response to many perceived limitations of Java. Some limitations, like the lack of lambda expressions, have been addressed in recent versions of Java. However, Scala's creators believe such improvements will leave developers wanting even more -- and Scala, not Java, will provide them in ways that developers will prefer.

Groovy, formerly stewarded by Pivotal but now an Apache Software Foundation project, was also developed as a complement to Java -- a way to mix in features from languages like Ruby or Python while still keeping the resulting language accessible to Java developers. It, too, functioned in part as a critique of Java by providing less-verbose versions of many Java expressions.

The JVM ports: Jython, JRuby, and the rest

Another side effect of the JVM serving as a language target: Implementations of several languages now run there as well. For example, if you thought Node.js was the first time JavaScript ran as a server-side entity, think again: Mozilla's Rhino has been doing so, in Java and on the JVM, since 1999 (albeit in only an open source variety after 2006).

Most prominent among the ported languages -- and relevant to enterprise developers -- are Python and Ruby, which have been implemented in JVMs as Jython and JRuby, respectively. As with the other JVM languages, hosting Python and Ruby on the JVM gives them access to the existing universe of Java software. This relationship works both ways: You can leverage Python from within Java applications as a scripting language for testing, by way of Jython.

Despite the speed of languages on the JVM, there's no guarantee that a JVM-ported version of a language will be higher-performing than its other incarnations. Jython, for example, is sometimes faster, sometimes slower than the conventional CPython implementation; performance depends greatly on the workload. Likewise, JRuby can be faster than its stock implementation, but not always.

Another disadvantage of a JVM-hosted version of a language: It doesn't always track the most recent version of the language. Jython, for example, supports only the 2.x branch of Python.

The next steps for the JVM

Even apart from performance issues, it's unlikely any of these languages will replace Java. But that has never been the plan -- after all, why replace Java when it's so widely entrenched, successful, and useful?

Instead, it's better to take the culture that's sprung up around Java -- all the libraries and applications -- and make it useful by way of the JVM to far more than Java programmers.

Next, the JVM must become easier to use as a development environment for forward-thinking language work. In 2014, Oracle unveiled Graal VM, a project that exposes the JVM's innards via Java APIs. When completed, this will allow programmers to create new languages for the JVM by using Java as a kind of command-and-control language. (Prototypes of JavaScript, Ruby, and R hosted with Graal showed promising, if inconsistent, results.)

Tougher to predict is whether the JVM or its successors can foster a new language that's as influential and broad as Java itself -- or whether such a language comes from another direction entirely.

JavaScript and the V8 engine for JavaScript are strong candidates as influential successors to Java. Node.js already has a culture of software reuse akin to Java's own, and languages that transpile to JavaScript allow use of the ecosystem without having to write JavaScript.

But with Java preparing for major makeovers, languages on the JVM seem far closer to the beginning of their journey than to the journey's end.

Wednesday, May 20, 2015

A Relevant Tale: How Google Killed Inktomi

 

 

 

On March 20th, 2000, Inktomi had a market capitalization of 25 billion dollars. As a relatively early employee, I was a multimillionaire on paper. Life was good. In the next year and a half the stock went down by 99.9%. In the end, Inktomi was acquired by Yahoo for $250M. What happened? Among other things, Google. Grab some popcorn and enjoy this story.

Inktomi was the #1 search engine in the world for a while. When I joined we had just won the Yahoo contract, and were serving search results for HotBot (there is still a search page there!) At first I worked on developing crawling and indexing tools written in C++. Our main goal at the time was to grow our index size, and at the same time to improve relevance. It became clear that as our document base grew, relevance would play a more important role. For ten million documents you may be able to filter out all but a handful of documents with a few well-chosen keywords. In that case any relevance algorithm would do; your desired result would be present in the one and only result page. You wouldn’t miss it. For a billion documents however, the handful would become hundreds or thousands. Without a good relevance algorithm, your desired result might be on page 17. You’d give up before getting to it.

At first we were using a classic tf-idf based model, enhanced by emphasizing certain features of pages or urls that correlated with “goodness.” For example, yahoo.com is probably more relevant to the query yahoo than yahoo.com/some/deep/page.html. We thought shorter urls were better. Of course this query was very popular, so spammers started creating pages stuffed with the word Yahoo. This was the beginning of an arms race that continues today. Back then we were the main target because we processed more searches than anyone else.
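As a rough illustration of the kind of scoring described above, here is a minimal Python sketch of a tf-idf ranker with a URL-depth boost. This is not Inktomi's actual code: the function name `tf_idf_scores`, the smoothed idf formula, and the choice to divide the score by URL depth are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf_scores(query, docs):
    """Score documents for a query with a plain tf-idf sum.

    docs maps a URL to the page text. The depth penalty at the end is a
    hypothetical stand-in for the "shorter URLs are better" heuristic.
    """
    tokenized = {url: text.lower().split() for url, text in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for url, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many pages contain the term at all.
            df = sum(1 for other in tokenized.values() if term in other)
            if df == 0 or not tokens:
                continue
            tf = counts[term] / len(tokens)
            # Smoothed idf (an assumption; many variants exist).
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0
            score += tf * idf
        # Assumed boost: fewer path segments means a smaller divisor.
        path = url.split("://", 1)[-1].rstrip("/")
        depth = path.count("/")
        scores[url] = score / (1 + depth)
    return scores
```

With identical page text, `yahoo.com` outranks `yahoo.com/some/deep/page.html` purely because of the depth penalty. The sketch also shows the weakness spammers exploited: a page that repeats the query word many times gets a higher term frequency and therefore a higher score.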


Enter The Google

Yahoo had been complaining to us about not being result #1 for yahoo for a while. We fixed that special case, but we couldn’t do the same for many other sites or pages. In 1999 Google was gaining popularity because they were solving exactly this problem. We didn’t perceive them as a threat yet, but we did realize that we had to do our own version of PageRank. I was assigned to that task.

My small contribution to improving our relevance was coming up with a simple formula to take into account the occurrences of words in links pointing to pages. The insight was realizing that this followed a power law: at the time Yahoo.com had about 1M instances of the word yahoo in links pointing to it. Nobody else came close. Other Yahoo properties had an order of magnitude less, and then came a long tail of other sites. I decided to use the logarithm of the count as a boost for the word in the document. This wasn’t as sophisticated as PageRank (we’d get to that later), but it was a huge improvement. Our relevance got much better over time as other people spent countless hours implementing our own link analysis algorithms. We had a clear mandate from the execs; our priorities at search were:

1) relevance

2) relevance

3) relevance
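The logarithmic link-text boost described before the priority list can be sketched in a few lines of Python. The names `anchor_boost` and `boosted_score`, the `weight` parameter, the additive combination, and the use of `log1p` (so that a count of zero gives a boost of zero) are assumptions for illustration; the original formula is not given in this post.

```python
import math


def anchor_boost(anchor_count):
    """Boost for a word, given how many inbound links contain that word.

    log1p(n) = log(1 + n) is an assumed variant of "the logarithm of the
    count" that stays defined (and equal to zero) when there are no links.
    """
    return math.log1p(anchor_count)


def boosted_score(base_score, anchor_count, weight=1.0):
    """Combine a text-based score with the link-text boost (assumed additive)."""
    return base_score + weight * anchor_boost(anchor_count)
```

The point of the logarithm is that it compresses a power-law distribution: a page with a million matching links still beats one with a hundred thousand, but by a modest margin rather than by a factor of ten.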

Doug Cook built a tool to quickly measure the relevance effects of algorithmic changes based on precomputed human judgments. For example: it was clear that Yahoo.com was the definitive result for the query “yahoo” so it would score a 10. Other Yahoo pages would be ok (perhaps a 5 or 6). Irrelevant pages stuffed with Yahoo-related keywords would be spam, and humans would give them a negative score if they showed up for that query. Given ten results and a query, we could instantly evaluate the goodness of the results based on the human rankings.
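A minimal Python sketch of this kind of evaluation, assuming the judgments are stored as a dictionary keyed by (query, URL) and that a result page is scored by the plain average of its judged results. The function name `evaluate_results`, the averaging, and the treatment of unjudged URLs as zero are assumptions, not a description of the actual tool.

```python
def evaluate_results(query, results, judgments, unjudged=0.0):
    """Average the precomputed human scores for one page of results.

    judgments maps (query, url) to a human score, e.g. 10 for the
    definitive page and a negative number for spam. URLs without a
    judgment get the `unjudged` default (an assumption).
    """
    scores = [judgments.get((query, url), unjudged) for url in results]
    return sum(scores) / len(scores) if scores else 0.0
```

Running the same queries through two rankers and comparing the averages gives a quick signal of whether an algorithmic change helped. A real tool would probably also weight higher positions more heavily, which this sketch leaves out.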

We had a sample corpus of links and queries for which we could run this test as often as we wanted, and compare ourselves against Google. We did this for months until it became clear that we were “as good as Google.” Our executives were happy.

Relevance Is Only So Relevant

I thought about why I was using Google myself, and I’m sure it’s obvious to everyone now: the experience was superior.

  • Inktomi didn’t control the front-end. We provided results via our API to our customers. This caused latency. In contrast, Google controlled the rendering speed of their results.
  • Inktomi didn’t have snippets or caching. Our execs claimed that we didn’t need caching because our crawling cycle was much shorter than Google’s. Instead of snippets, we had algorithmically-generated abstracts. Those abstracts were useless when you were looking for something like new ipad screen resolution. An abstract wouldn’t let you see that it’s 2048×1536; you’d have to click a result.

In short, Google had realized that a search engine wasn’t about finding ten links for you to click on. It was about satisfying a need for information. For us engineers who spent our day thinking about search, this was obvious. Unfortunately, we were unable to sell this to our executives. Doug built a clutter-free UI for internal use, but our execs didn’t want to build a destination search engine to compete with our customers. I still have an email in which I outlined a proposal to build a snippets and caching cluster, which was nixed because of costs.

Are there any lessons to be learned from this? For one, if you work at a company where everyone wants to use a competitor’s product instead of its own, be very worried. If I were an executive at such a company I would follow Yoda’s advice: “Do or do not. There is no try.” If you’re not willing to put in the effort to compete, you might as well cut your losses (like Google did with Buzz, for example).
