Wednesday, November 30, 2016

Modern Day Messaging Patterns

It's been a few days now, and I have been focused on understanding modern day messaging patterns for a problem I am trying to solve. I know there are existing server-side tools like ActiveMQ, RabbitMQ and even WMS that can do the trick and already have pre-defined patterns tested and validated for performance and security. I am not trying to reinvent the wheel by creating a new pattern or rebuilding any of these server-side products, but I do want to understand how these products were built and whether I can leverage some of their principles in a server-side application I am writing. For example, in modern web application development, if you are on .NET, you have patterns like the ones defined here: Microsoft SOA patterns, which do the neat tricks you would need.

Man, at times I feel I am going at 300 miles an hour without any crash guards: code reviews, custom product development, customer engagements, team management, generating pipelines for my practices, reviewing and helping the team analyze problems, Big Data workloads, Machine Learning, my data science degree, etc. have consumed the majority of my life for the past 2.5 years. Thank goodness I am going to be taking an extended break early next year for my brother's wedding.

Here is the gist of the things I am trying to solve, though:

1. Dynamic reformatting of my messages
2. Integration with Machine Learning to collaborate with specific scientific models (pre-created)
3. Templated messaging with dynamic parameterization
4. Distribution channel modification
5. AI Hub and Spoke messaging relay

Wednesday, October 19, 2016

Power BI To Embed Or Not To Embed

It is critical for organizations to work and play with data. Power BI, the reporting solution from Microsoft, is literally scorching the market with its rapid growth in usage. A quick note on interacting with a Power BI report like the following -->

This report is accessible to the public. To create a more personalized reporting structure with advanced security, Power BI Embedded is the way to go. Create a workspace collection in Azure and then generate the required API keys (two by default: primary and secondary). Your web application will use these API keys. Once this is done, create the pbix solution file in the desktop tool and publish or import it to the Azure workspace using PowerShell, C#, Ruby, Java, etc. To interact with the pbix file in your application, you then need to use the Power BI Embed APIs.

However, there is another approach: using the Power BI APIs instead of the Embed APIs. The Embed APIs are a pay-as-you-go service in which page views determine the pricing. With an embedded report you register the Power BI workspace, whereas with a straightforward Power BI report you publish to Azure the web application that consumes the report through the Power BI APIs. The core difference is that the second approach requires more development than the first.

Wednesday, June 15, 2016

Microsoft acquires LinkedIn

On Monday 6/13/2016, Microsoft announced its acquisition of LinkedIn. This is a major game changer in the world of IT. But before we get to some of the advantages of this acquisition: Microsoft was actually working on a LinkedIn killer on its Dynamics CRM platform. The idea was to generate a bigger footprint for its CRM solution as well as create something unique with it. This was started in early 2012, well before the acquisition of LinkedIn. Here are my thoughts on where this acquisition will lead Microsoft and LinkedIn:

  • Microsoft gains a huge database of professionals and organizations in various streams: This alone is the most massive gain for Microsoft. It could start targeting professionals and organizations to either move onto or join the Microsoft platform, which could bolster its sales by a huge margin.
  • Microsoft integration of LinkedIn ads with Bing: Just imagine an organization trying to establish a marketing campaign. With LinkedIn ads and Bing ads integrated, an organization will have more opportunities to get page views or clicks and the potential to accelerate CTRs to conversions. This could come with a potential increase in the cost of a campaign, but it might turn out to be a really profitable decision.
  • Changes in technology trends at Microsoft will accelerate: Since the foundation of LinkedIn is based on cloud computing and Big Data, LinkedIn will have made a lot of strides in architecture and open source technology. If these can be converted into products in the Azure suite, Microsoft might make a sizable profit on them. This will also increase the trend of embracing open source over closed technology stacks (Azkaban, Voldemort, increased usage of Kafka and RabbitMQ, etc.).
  • Added stream of revenue: During an interview with Satya and Jeff, Satya mentioned the integration with O365 and Azure as the major driver, and Jeff mentioned that it made sense for LinkedIn to sell at this point. But I think the potential driver for this deal was the significant monetary gain that Microsoft will have in terms of ads and revenue generated from LinkedIn subscribers. LinkedIn had major competition coming its way in the form of more local or regional corporate social websites, but this deal gives it a major edge over its competitors thanks to Microsoft's global reach. Azure and O365 could potentially be outlets for Microsoft to create LinkedIn-based apps similar to Yammer.
  • Allow organizations to create internal social platforms using the technology gained by LinkedIn: Microsoft could potentially make a configurable LinkedIn app for all its devices inclusive of the XBOX that organizations can tap into and create internal social networking platforms.

It would be fun to make a prediction as to where this acquisition will lead Microsoft. Probably a story to tell another day.

Tuesday, May 24, 2016

R - Notes

The following are basically my notes from studying R, meant as a reference point for myself.
Just a few pointers for anyone preparing for or studying R:
  • Take a quick look at your statistical math basics before proceeding
  • Before applying any formula on your base data, try to understand what the formula is and how it was derived (this will make it easier for one to understand)
  • Use it in tandem with the Data Analysis tools in Excel
  • Refer to the cheat sheets available on
  • Segregate the workbench for each module
  • There are best practices that can be incorporated while programming in R
  • Try and jot notes when and where one can... 
  • Refer to existing data-sets embedded in R before jumping into a file
  • Refer to R programs written already in Azure ML

rnorm() by default has mean 0 and variance 1
head() has its own built in precision
*default settings in R can be modified by the options() function
options(digits = 15)
#will display 15 significant digits (max for the digits option --> 22, min --> 0)
#a value > 22 raises: Error in options(digits = 30) : invalid 'digits' parameter, allowed 0...22
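
A small sketch of how the digits option changes what gets printed. The value 22/7 and the variable names are just illustrative choices of mine; format() with a per-call digits argument is shown as an alternative that leaves the global option alone.

```r
# Remember the current options so they can be restored afterwards
old <- options(digits = 7)       # 7 is R's default

x <- 22 / 7
print(x)                         # 3.142857

options(digits = 15)
print(x)                         # 3.14285714285714

# format() takes its own 'digits' without touching the global option
short <- format(x, digits = 3)   # "3.14"

options(old)                     # restore the previous setting
```

Restoring the saved options at the end keeps the change from leaking into the rest of the session.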

#Infinity Operations
Inf/0 --> Inf
Inf * 0 --> NaN
Inf + 0 + (0/0) --> NaN
Inf + 0  --> Inf
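
A quick way to check these cases at the console. This is only a sketch with variable names I made up; is.finite() and is.nan() are the base R helpers for telling the special values apart.

```r
a  <- Inf / 0      # Inf
b  <- Inf * 0      # NaN: infinity times zero is undefined
c0 <- 0 / 0        # NaN
d  <- Inf + 0      # Inf
e  <- Inf - Inf    # NaN

# Helpers for testing special values (NA is neither finite nor NaN)
is.finite(c(1, Inf, NaN, NA))   # TRUE FALSE FALSE FALSE
is.nan(c(1, Inf, NaN, NA))      # FALSE FALSE TRUE FALSE
```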

*ls() lists all the variables stored in R memory at a given point in time
*rm() removes objects from that list

*To look up help for a command in R, use ? followed by the function name, e.g. ?rnorm

*Functions and Datastructures

*Again single valued functions and multi valued functions

*A special vector is called a factor
gl() --> generate levels
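
A minimal sketch of factors and gl(), using made-up labels. A factor stores integer codes plus a set of levels, and gl(n, k) generates n levels each repeated k times.

```r
# A factor with an explicit level order
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"))
levels(sizes)        # "small" "medium" "large"
as.integer(sizes)    # 1 3 2 1  (the underlying integer codes)

# gl(2, 3): two levels, each repeated three times
treatment <- gl(2, 3, labels = c("control", "treated"))
table(treatment)     # 3 observations per level
```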

*creating a function in R
test <- function(x) {
  return(x*x + (x^2))
}

*for loop in R

*lapply() vs sapply()
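
A short sketch of the for loop and the two apply variants side by side, with an illustrative squaring function. lapply() always returns a list, while sapply() simplifies the result to a vector when it can.

```r
values <- c(1, 2, 3)

# for loop: fill a pre-allocated vector
squares <- numeric(length(values))
for (i in seq_along(values)) {
  squares[i] <- values[i]^2
}

# lapply() returns a list with one element per input
as_list <- lapply(values, function(v) v^2)

# sapply() simplifies the same result to a numeric vector: 1 4 9
as_vector <- sapply(values, function(v) v^2)
```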

*Binding elements
rbind() --> bind elements in a matrix in a row manner
cbind() --> bind elements in a matrix in a columnar manner

*Every vector/matrix has a data mode....

*Can be found using mode()

*dimensions in matrices
=defines the number of rows and columns in a matrix

*can be used with dimnames(), rownames(), colnames()
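
A minimal sketch tying together binding, mode and dimension names. The numbers and names are arbitrary examples of mine.

```r
r <- rbind(c(1, 2, 3), c(4, 5, 6))   # rows stacked: 2 x 3 matrix
k <- cbind(c(1, 2, 3), c(4, 5, 6))   # columns side by side: 3 x 2 matrix

dim(r)    # 2 3
dim(k)    # 3 2
mode(r)   # "numeric"

# Name the rows and columns, then index by name
rownames(r) <- c("a", "b")
colnames(r) <- c("x", "y", "z")
dimnames(r)
r["b", "y"]   # 5
```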

*Navigating through R package libraries is really bad....

*Hmisc --> Harrell Miscellaneous. Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX code, and recoding variables.

*R resolves relative file paths against its working directory (the search path, shown by search(), is the list of attached packages)

getwd() --> get working directory

*to read in a table format:
testfile <- read.table("filename")
read.fwf() --> reads a fixed width file

scan()--> reads a content of a file into a list or vector

*file() can create connections to files for read/write purposes
f1 <- file("filename")
close(f1)--> close the file connection

sink() --> send R output to a file
dput() --> save complicated R objects (in ASCII format)
dget() --> inverse of dput()

*file in conjunction with open="w" option
R has its own internal binary object
use save() & load() for binary format
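
A rough sketch of both ways of persisting an object: the text round trip with dput()/dget() and the binary round trip with save()/load(). The object and the temporary file paths are made up for illustration.

```r
obj <- list(id = 1L, tags = c("a", "b"), score = 2.5)

# Text (ASCII) representation: dput() writes it, dget() reads it back
txt_path <- tempfile(fileext = ".txt")
dput(obj, file = txt_path)
from_text <- dget(txt_path)

# Binary format: save() writes named objects, load() recreates them
bin_path <- tempfile(fileext = ".RData")
save(obj, file = bin_path)
rm(obj)
load(bin_path)                        # 'obj' exists again under its old name

same <- identical(obj, from_text)     # both routes give back the same object

unlink(c(txt_path, bin_path))         # clean up the temporary files
```

Note that load() restores objects under the names they were saved with, whereas dget() returns a value that you assign yourself.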

*RODBC Package
Common Functions

*specify the version of the driver (TDS_Version=8.0) and which port to use (default: 1433).
query <- "select t1, t2 from test"
check dimensions of a table using dim()

*summary() -> gives a range of stats on the underlying vector,list,matrix

Which function should you use to display the structure of an R object? --> str()

Log(dataframe) to investigate the data

Calculate Groups


Convert counts to relative frequencies using prop.table()
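
A small sketch of the counts-then-proportions step, using the built-in mtcars data set as a stand-in for whatever data is being grouped.

```r
# Count cars per number of cylinders
counts <- table(mtcars$cyl)    # 4 -> 11, 6 -> 7, 8 -> 14

# Turn the counts into proportions that sum to 1
props <- prop.table(counts)
round(props, 2)
```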

Simulations in R
MCMC (Markov Chain Monte Carlo)
Performance Testing
Drawback --> Uncertainty

Pseudo Random Number Generator - The Mersenne Twister
Mersenne Prime


Uniform distribution - runif(5,min=1,max=2)
Normal distribution - rnorm(5,mean=2,sd=1)
Gamma distribution - rgamma(5,shape=2,rate=1)
Binomial distribution -rbinom(5,size=100,prob=.3)

Multinomial Distribution - rmultinom(5,size=100,prob=c(.2,.4,.7))
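
Since the generator is pseudo random, a simulation can be made repeatable with set.seed(). A minimal sketch, with an arbitrary seed of my choosing. It also shows that rmultinom() normalizes prob internally, so the weights above do not have to sum to 1.

```r
RNGkind()[1]   # name of the generator in use (Mersenne-Twister unless changed)

# Same seed, same draws
set.seed(42)
first <- runif(5, min = 1, max = 2)
set.seed(42)
second <- runif(5, min = 1, max = 2)
identical(first, second)   # TRUE

# One column per draw, one row per category; each column sums to 'size'
draws <- rmultinom(5, size = 100, prob = c(.2, .4, .7))
colSums(draws)
```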

eruption.lm = lm(eruptions ~ waiting, data=faithful)
coeffs = coefficients(eruption.lm)
waiting = 80           # the waiting time 
duration = coeffs[1] + coeffs[2]*waiting 
duration --> Predicted value
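
The same prediction can also be obtained with predict(), which avoids indexing the coefficients by hand. A sketch assuming the same faithful model; the variable names are mine.

```r
eruption.lm <- lm(eruptions ~ waiting, data = faithful)

# Manual calculation: intercept + slope * waiting time
manual <- unname(coef(eruption.lm)[1] + coef(eruption.lm)[2] * 80)

# predict() takes the new observations as a data frame
via_predict <- unname(predict(eruption.lm, newdata = data.frame(waiting = 80)))

all.equal(manual, via_predict)   # TRUE
```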

load ggplot2 using library(ggplot2)

Compare models using ANOVA
X1 <- lm(y ~ x1 + x2 + x3 + x4, data=mydata)
Y1 <- lm(y ~ x1 + x2, data=mydata)
anova(X1, Y1)
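
A self-contained sketch of comparing two nested models with anova(). The data here is simulated purely for illustration (mydata, the coefficients and the seed are my assumptions), with y depending only on x1 and x2.

```r
set.seed(1)
n <- 100
mydata <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
mydata$y <- 1 + 2 * mydata$x1 - mydata$x2 + rnorm(n)   # x3 and x4 play no role

full    <- lm(y ~ x1 + x2 + x3 + x4, data = mydata)
reduced <- lm(y ~ x1 + x2, data = mydata)

# F test of whether the two extra terms improve the fit
cmp <- anova(reduced, full)
cmp
```

Both models should be fitted on the same data, and the smaller model's terms should be a subset of the larger one's, for the comparison to be meaningful.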

Saturday, February 13, 2016

Hadoop Installation on Win 10 OS

Setting the Hadoop files prior to Spark installation on Win 10:
1. Ensure that your JAVA_HOME is properly set. A recommended approach here is to navigate to the installed Java folder in Program Files and copy the contents into a new folder you can locate easily, e.g. C:\Projects\Java.
2. Create a user variable called JAVA_HOME and enter "C:\Projects\Java"
3. Add to the path system variable the following entry: "C:\Projects\Java\Bin;"
4. Create a HADOOP_HOME variable and specify the root path that contains all the Hadoop files, e.g. "C:\Projects\Hadoop"
5. Add to the path variable the bin location for your Hadoop repository: "C:\Projects\Hadoop\bin" <Keep track of your Hadoop installs, e.g. C:\Projects\Hadoop\2_5_0\bin>
6. Once these variables are set, open command prompt as an administrator and run the following commands to ensure that everything is set correctly:
A] java
B] javac
C] hadoop
D] hadoop version
7. Also ensure your winutils.exe is in the Hadoop bin location.
< Download the same from ->
8. An error related to the configuration location might also occur. Add the following to the hadoop-env.cmd file to rectify the issue:
set HADOOP_PREFIX=C:\Projects\Hadoop

9. Another issue that I faced with the Hadoop 2.6.0 install was with hadoop.dll. I had to recompile the source using MS VS to generate the hadoop.dll and pdb files, and then replaced the hadoop.dll that came with the install.
10. Another error that I faced was "The system cannot find the batch label specified - nodemanager". Replace all the "\n" characters in the yarn.cmd file with "\r\n".
11. Also replace the "\n" characters in the hadoop.cmd file with "\r\n".

12. Yarn-site.xml change is as shown in the screenshot below:

13. Make changes to the core-site.xml as shown in the screenshot below:

14. Make the configuration changes as per the answer here :
15. Download Eclipse Helios for your Win OS to generate the jars required for your map reduce applications. Use jdk1.7.0_71 and not the 1.8+ versions to compile your Hadoop MapReduce programs.
16. Kickstart your Hadoop dfs and yarn, add data from any of your data sources and get ready to map reduce the heck out of it.... <A quick note: after formatting your namenode, it defaults to a tmp folder along with your machine name... in my case it is C:\tmp\hadoop-myPC\dfs\data>

Monday, December 21, 2015

Tableau Dashboards Published...

A few Tableau dashboards I have published of late to give a flavor of the different visualizations within Tableau:

Sunday, November 22, 2015

Cyclotron's Android App

Just created a Xamarin Android mobile application. It was extremely easy to use and did not require much reading of the resources to understand how to go about building it. The first iteration is as shown in the figure below:
Though the emulator (Nexus 5 through 7) did not render as clearly as I wanted it to, it is still a great start for v1.0. The next version will integrate with Google Maps. As soon as you click the app, you get a splash screen and then navigate to Cyclotron's main menu, from where you can navigate to the layouts. The support aspect will also be a part of the next iteration of the app, along with the login. A few more images were added to the individual activities.

Probably leverage this article as the initial help on how to use the app.... The follow up items are as follows:
1. Integration with Google Maps
2. Synchronization with Cyclotron's support database
3. Login for Support
4. Tweak the UI
5. Replicate for iOS