Tech Note: Public Key Cryptography

Four rules that are core to the use of public key cryptography and digital signatures:

  • When encrypting a message, use the recipient’s public key
  • When decrypting a message that you have received, use your private key
  • To digitally sign a message that you are sending to someone, use your private key
  • To verify the signature on a message sent to you by someone, use the sender’s public key
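
To make the four rules concrete, here is a minimal sketch using only the JDK's built-in crypto classes. The class name FourRules, the choice of RSA with OAEP padding, and SHA256withRSA for signatures are my assumptions for illustration; the rules themselves do not depend on any particular algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Arrays;
import javax.crypto.Cipher;

final class FourRules {
    // Assumed algorithms; the note itself does not prescribe any.
    private static final String CIPHER = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    private static final String SIGNATURE = "SHA256withRSA";

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Rule 1: encrypt with the recipient's public key.
    static byte[] encrypt(byte[] plainText, PublicKey recipientPublicKey) {
        return crypt(Cipher.ENCRYPT_MODE, plainText, recipientPublicKey);
    }

    // Rule 2: decrypt with your own private key.
    static byte[] decrypt(byte[] cipherText, PrivateKey ownPrivateKey) {
        return crypt(Cipher.DECRYPT_MODE, cipherText, ownPrivateKey);
    }

    // Rule 3: sign with your own private key.
    static byte[] sign(byte[] message, PrivateKey ownPrivateKey) {
        try {
            Signature signer = Signature.getInstance(SIGNATURE);
            signer.initSign(ownPrivateKey);
            signer.update(message);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Rule 4: verify with the sender's public key.
    static boolean verify(byte[] message, byte[] signature, PublicKey senderPublicKey) {
        try {
            Signature verifier = Signature.getInstance(SIGNATURE);
            verifier.initVerify(senderPublicKey);
            verifier.update(message);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] crypt(int mode, byte[] input, Key key) {
        try {
            Cipher cipher = Cipher.getInstance(CIPHER);
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair alice = newKeyPair(); // the sender
        KeyPair bob = newKeyPair();   // the recipient
        byte[] message = "hello Bob".getBytes(StandardCharsets.UTF_8);

        byte[] cipherText = encrypt(message, bob.getPublic());
        byte[] recovered = decrypt(cipherText, bob.getPrivate());
        System.out.println("decrypted matches: " + Arrays.equals(message, recovered));

        byte[] signature = sign(message, alice.getPrivate());
        System.out.println("signature valid: " + verify(message, signature, alice.getPublic()));
    }
}
```

Note that raw RSA can only encrypt short messages (under 200 bytes with this key size and padding), so real systems typically encrypt a symmetric key this way and use that key for the message body.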

Getting Started with Phoenix

HDInsight HBase clusters now include Phoenix support.  Phoenix adds SQL queries on top of an HBase cluster.  It does this by compiling your SQL query into a series of HBase table scans and returning the results as a regular JDBC result set.

What is Phoenix?

Apache Phoenix originated at Salesforce.com as an internal project to make it easier to work with big data systems, in particular HBase, a NoSQL database in the Hadoop ecosystem.  Phoenix enables OLTP and analytics for low-latency applications by combining standard SQL and JDBC APIs and full ACID transaction capabilities with the schema-on-read, late-bound capabilities of the NoSQL world.

The Phoenix framework provides both client and server libraries.  On the server, Phoenix provides custom HBase co-processors that handle indexing, joins, transactions, and schema management, all features that HBase doesn’t provide on its own.  On the client side, Phoenix provides a library that handles parsing and query plan selection, then interacts with the HBase API, converting SQL into SCAN, PUT, and DELETE operations that execute server-side with the help of the Phoenix co-processors.

Phoenix is supported by a number of Hadoop distributions, including Hortonworks, MapR, and Cloudera.

How to Connect

Phoenix supports connecting with a JDBC driver.  To connect to an HDInsight HBase cluster using Phoenix, create a connection as follows:

The connection string of jdbc:phoenix: is all that is needed.  The ZooKeeper nodes will be pulled from the hbase.zookeeper.quorum property in the hbase-site.xml file if present.  You can, however, specify your ZooKeeper nodes directly in the connection string if needed.  For example:
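
Here is a minimal sketch of both forms of the connection string, written against the standard JDBC API.  The class name PhoenixUrls, the ZooKeeper host names, and the /hbase-unsecure znode parent are assumptions for illustration; check your own cluster's hbase-site.xml for the real values.  Opening a connection also assumes the Phoenix client JAR is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;

final class PhoenixUrls {
    // Shortest form: ZooKeeper nodes are read from hbase-site.xml.
    static final String DEFAULT_URL = "jdbc:phoenix:";

    // Explicit form: jdbc:phoenix:<zk hosts>:<port>:<znode parent>
    static String withQuorum(List<String> zkHosts, int port, String znodeParent) {
        return DEFAULT_URL + String.join(",", zkHosts) + ":" + port + ":" + znodeParent;
    }

    // Requires the Phoenix JDBC driver on the classpath and a reachable cluster.
    static Connection open(String url) throws SQLException {
        return DriverManager.getConnection(url);
    }

    public static void main(String[] args) {
        // Hypothetical host names and znode parent.
        List<String> hosts = Arrays.asList("zk0-example", "zk1-example", "zk2-example");
        System.out.println(withQuorum(hosts, 2181, "/hbase-unsecure"));
    }
}
```

On a cluster node, passing PhoenixUrls.DEFAULT_URL to open() should be enough; the explicit form is mainly useful when hbase-site.xml is not on the client's classpath.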

Creating Indexes

Phoenix provides the capability to create secondary indexes on top of HBase.  Secondary indexes, unlike primary indexes, may have duplicate values.  HBase does not natively support secondary indexes, leaving just the row key available for scanning.

Secondary indexes can be created on both tables and views.  They are kept up to date automatically as data in the table changes.  Phoenix supports different types of indexes: covered, functional, global, and local.

Global indexes are great for read-heavy use cases.  The performance hit for managing the index is taken at write time (during UPSERT or DELETE).  Phoenix intercepts data table updates on write and applies the corresponding updates to all index tables.  At read time, Phoenix selects the index table that produces the fastest query.

Local indexes are better for write-heavy use cases.  All local indexes of a table are stored in shadow column families in the same data table.  Because of this, index data lives on the same server as the table data, which avoids network overhead during writes.  This comes at the cost of some overhead at read time: every region must be examined, since the exact region location of an index entry is not known in advance.
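
As a rough sketch, the index types map onto Phoenix's CREATE INDEX statement like this.  The helper class PhoenixIndexDdl and any table, column, and index names you pass in are hypothetical; the statements are built by plain string concatenation, so this is only suitable for names you control.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class PhoenixIndexDdl {
    // Global index (the default): a separate index table, maintained at write time.
    static String globalIndex(String name, String table, String column) {
        return "CREATE INDEX " + name + " ON " + table + " (" + column + ")";
    }

    // Local index: index data is stored alongside the data table's regions.
    static String localIndex(String name, String table, String column) {
        return "CREATE LOCAL INDEX " + name + " ON " + table + " (" + column + ")";
    }

    // Covered index: INCLUDE copies extra columns into the index so a query
    // that only needs those columns never has to touch the data table.
    static String coveredIndex(String name, String table, String indexed, String included) {
        return globalIndex(name, table, indexed) + " INCLUDE (" + included + ")";
    }

    // Functional index: indexes an expression rather than a plain column.
    static String functionalIndex(String name, String table, String expression) {
        return "CREATE INDEX " + name + " ON " + table + " (" + expression + ")";
    }

    // Runs one DDL statement over an existing Phoenix JDBC connection.
    static void create(Connection connection, String ddl) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(ddl);
        }
    }

    public static void main(String[] args) {
        // Hypothetical table and column names.
        System.out.println(globalIndex("idx_city", "customers", "city"));
        System.out.println(localIndex("idx_city_local", "customers", "city"));
        System.out.println(coveredIndex("idx_city_cov", "customers", "city", "name"));
        System.out.println(functionalIndex("idx_upper_name", "customers", "UPPER(name)"));
    }
}
```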

Phoenix with SQLLine

HDInsight includes a helpful utility called SQLLine, a simple shell for executing SQL commands against a database.  It is a great tool for exploring and playing around with Phoenix.  To access SQLLine:

  1. ssh into an HDInsight HBase cluster.
  2. cd /usr/hdp/2.2.9.1-7/phoenix/bin
  3. ./sqlline.py

After connecting to the SQLLine client, you can execute commands against the Phoenix database.

Here are some helpful commands:

Azure hosting for the price of a coffee

If you are looking to get started with Azure, and your needs are simple, you can get by for the price of a single (fancy) coffee each month. In this post, you will learn how to host a simple Web application with a SQL Server back end.

Hosting a Web App

For hosting the Web application, create an Azure App Service.  App Service supports .NET, Java, Node.js, PHP, and even Python.  The great thing about Azure App Service is that pricing starts at free. The free plan is limited to 60 CPU minutes a day, 1GB RAM, and 1GB disk space. This is perfect for development or a very low traffic web site. The free plan has the additional limit of not being able to use a custom domain. If you are looking to take your web site to the next level, you can move to the shared plan, which will cost you approximately $9.67/mo.

SQL Server

When on a budget, you can’t go wrong with Azure SQL Database. Prices start at just $4.98/mo for the Basic tier. This will give you 2GB of storage and 5 DTUs, enough for developing and testing your next big thing.

Summary

Whether you are looking to learn a bit about Azure, support developing a new app, or host a simple app, you can get started for around $5-15/mo depending on your needs.  (And yes, I’m pretty sure my wife has paid $15 for a fancy coffee.)

More free stuff… For those completely new to Azure, don’t forget Microsoft offers $200 in free credit to get you started.
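
For the curious, here is where that range comes from, as a small sketch that adds up the prices quoted in this post.  The class name AzureBudget is made up, and the prices are simply the ones quoted above, kept in cents to avoid floating-point rounding.

```java
import java.util.Locale;

final class AzureBudget {
    // Prices quoted in this post, in cents per month.
    static final int FREE_APP_SERVICE = 0;
    static final int SHARED_APP_SERVICE = 967;
    static final int BASIC_SQL_DATABASE = 498;

    static int monthlyCents(int appServiceCents, int sqlCents) {
        return appServiceCents + sqlCents;
    }

    static String dollars(int cents) {
        return String.format(Locale.US, "$%d.%02d", cents / 100, cents % 100);
    }

    public static void main(String[] args) {
        System.out.println("Free plan + Basic SQL:   "
                + dollars(monthlyCents(FREE_APP_SERVICE, BASIC_SQL_DATABASE)));
        System.out.println("Shared plan + Basic SQL: "
                + dollars(monthlyCents(SHARED_APP_SERVICE, BASIC_SQL_DATABASE)));
    }
}
```

The low end is the free App Service plan plus the Basic database; the high end swaps in the shared plan.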

Connecting to HDInsight HBase from an Azure VNet

I’ve been working with HBase on HDInsight for some time.  This is a series of tech notes I’ve accumulated over that time.  This tech note will talk about connecting to an HDInsight cluster with the native client.

If you are working with HBase on HDInsight, you have a couple of different options when connecting to the database from a client application.  In this tech note I will discuss connecting to HBase directly using the native HBase API.  To do this, the application must be hosted in the same VNet as the HDInsight cluster.

The native method for accessing data is through the HBase client.  At the time of this writing, I’m targeting HDInsight 3.5, which requires Java 8.

HDInsight 3.5 Maven Dependencies

The following dependencies are required to connect to the HDInsight cluster.  In addition, I’m referencing a resource file, hbase-site.xml, in the build section. I’ll discuss the file in the following section.

Include the hbase-site.xml file

The hbase-site.xml file contains the ZooKeeper hostnames required by the client to make a connection.  To grab the hbase-site.xml file from the cluster, follow these steps:

  1. Open a terminal and navigate to the project’s resource directory
  2. Run the following: scp username@hdinsighthost-ssh.azurehdinsight.net:/etc/hbase/conf/hbase-site.xml hbase-site.xml
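
To sanity-check the copied file, a small sketch like the following reads a property out of the Hadoop-style configuration XML using only the JDK's XML parser.  The class name HBaseSiteReader and the sample host names are hypothetical; the real HBase client reads the file from the classpath on its own, so this is just for inspection.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class HBaseSiteReader {
    // Returns the <value> of the <property> whose <name> matches, or null if absent.
    static String property(String xml, String wantedName) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = document.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                if (wantedName.equals(childText(property, "name"))) {
                    return childText(property, "value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration XML", e);
        }
    }

    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample; a real hbase-site.xml has many more properties.
        String sample = "<configuration>"
                + "<property><name>hbase.zookeeper.quorum</name>"
                + "<value>zk0-example,zk1-example,zk2-example</value></property>"
                + "</configuration>";
        System.out.println(property(sample, "hbase.zookeeper.quorum"));
    }
}
```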

Summary

The configuration discussed in this tech note will allow a client application to connect to an HDInsight HBase cluster, provided it is deployed and executed from the same Azure VNet.

HBase on HDInsight

I’ve been working with HBase on HDInsight for some time.  This is a series of tech notes I’ve accumulated over that time.  This introductory post will talk about what HBase is and how it is implemented on HDInsight.

HBase is an open-source NoSQL database based on Google’s Bigtable.  It provides random access while still providing strong consistency for large amounts of unstructured and semistructured data.  The database is column-oriented and essentially schemaless, requiring no more than a table name and column family definitions.
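
To give a feel for what "essentially schemaless" means, here is a toy in-memory model.  It is not the HBase API; the class ToyTable and its methods are invented for illustration.  The only things fixed up front are the table name and the column families, column qualifiers are made up on the fly, and rows are kept sorted by row key so a range of keys can be scanned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

final class ToyTable {
    private final String name;
    private final Set<String> families;
    // Row key -> ("family:qualifier" -> value), sorted by row key.
    private final NavigableMap<String, Map<String, String>> rows = new TreeMap<>();

    ToyTable(String name, String... families) {
        this.name = name;
        this.families = new HashSet<>(Arrays.asList(families));
    }

    // Any qualifier is accepted, but the column family must have been declared.
    void put(String rowKey, String family, String qualifier, String value) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException(
                    "table " + name + " has no column family " + family);
        }
        rows.computeIfAbsent(rowKey, key -> new TreeMap<>())
                .put(family + ":" + qualifier, value);
    }

    String get(String rowKey, String family, String qualifier) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(family + ":" + qualifier);
    }

    // Row keys in [startInclusive, stopExclusive), in sorted order.
    List<String> scan(String startInclusive, String stopExclusive) {
        return new ArrayList<>(
                rows.subMap(startInclusive, true, stopExclusive, false).keySet());
    }

    public static void main(String[] args) {
        ToyTable telemetry = new ToyTable("telemetry", "d");
        telemetry.put("sensor1#001", "d", "temp", "21.5");
        telemetry.put("sensor1#002", "d", "humidity", "40"); // new qualifier, no schema change
        telemetry.put("sensor2#001", "d", "temp", "19.0");
        System.out.println(telemetry.scan("sensor1#", "sensor2#"));
    }
}
```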

When working with HBase on HDInsight, Azure provides a managed cluster configured to store data on Azure Storage in place of HDFS.  The cluster still provides direct support for MapReduce, Hive, and other Hadoop native tools even though the underlying storage is not HDFS.

HBase is a great tool for large data needs and can support many different use cases, including:

  • key-value store
  • time-series data – telemetry-based streams
  • real-time queries (including SQL support via Phoenix)

Reverse engineering Android apk files

I’m diving into some competitive analysis at work, for which I needed to reverse engineer an Android apk.  There are some interesting things you can glean from decoding and disassembling application resources.

The best tool I found for the job is ApkTool (https://ibotpeaches.github.io/Apktool/), an open source reverse engineering application for apk files.

The tool allows for both decompiling and recompiling of the apk file if needed.

To disassemble from terminal:

apktool d apkfile.apk

The output contains the app’s code as smali, a human-readable disassembly of the dex format, which is Android’s version of bytecode used by both Dalvik and ART.  If, like me, you are new to all this stuff, you will need to review the Dalvik bytecode format.  Here are some links to help you out: