Google Natural Language API

Here are some notes about the Google Natural Language API that I wrote three years ago. Some features may have changed a bit; refer to the official documentation for the latest details and the basic concepts:



The Google Cloud Natural Language API provides natural language understanding technologies to developers, including:

  • sentiment analysis – English
  • entity recognition – English, Spanish, and Japanese – in fact, this is noun analysis
  • syntax analysis – English, Spanish, and Japanese

The API has a call for each feature, plus one call that does them all at once: analyzeEntities, analyzeSentiment, and annotateText.

sentiment analysis

Returns values that describe only how negative or positive the emotion of the text is (for now).
The result contains polarity and magnitude values.

entity recognition

find out the “entity” in text – prominent named “things” such as famous individuals, landmarks, etc.
return with entities and there URL/wiki, etc.

syntactic analysis

Syntax analysis returns two things:
1. the sentences/sub-sentences of the input text;
2. the tokens (words) and their grammatical metadata in a syntax dependency tree.


Test steps with commands ++++++++++++++++++++++++++++++++++

gcloud auth activate-service-account --key-file=/yourprojectkeyfile.json

gcloud auth print-access-token

print-access-token gives you a token for the following commands. I created three JSON files to test each feature, so I can use these commands to try the three APIs now:

curl -s -k -H "Content-Type: application/json" \
-H "Authorization: Bearer ya29.CjBWA2oWnup6dVvAlv6NTJyLsDtfqdCx70tX6_J0H7KFngd1ual2Osd8gCpcc" \
"https://language.googleapis.com/v1/documents:analyzeEntities" \
-d @entity-request.json

curl -s -k -H "Content-Type: application/json" \
-H "Authorization: Bearer ya29.CjBWA2op6dVv_T7nAlv6NTJyLsDtfqdCx70tX6_J0H7KFngd1ual2Osd8gCpcc" \
"https://language.googleapis.com/v1/documents:analyzeSyntax" \
-d @syntactic-request.json

curl -s -k -H "Content-Type: application/json" \
-H "Authorization: Bearer ya29.CjBWA2oWnup6_T7nAlv6NTJyLsDtfqdCx70tX6_J0H7KFngd1ual2Osd8gCpcc" \
"https://language.googleapis.com/v1/documents:annotateText" \
-d @3in1-request.json

For how to create these JSON input files, please refer to the Google SDK docs.
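For reference, a minimal entity-request.json might look like this (the sample sentence is made up; the field names follow the v1 documents:analyzeEntities request schema):

```shell
# Write a minimal request body for the entity call.
cat > entity-request.json <<'EOF'
{
  "document": {
    "type": "PLAIN_TEXT",
    "content": "Larry Page founded Google in Menlo Park."
  },
  "encodingType": "UTF8"
}
EOF
```

The syntax and 3-in-1 request files follow the same shape; annotateText additionally takes a "features" object selecting which analyses to run.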




Google Cloud Basic

I have been using some Google Cloud API features for some years, like ASR, TTS with WaveNet, Translate, and Calendar. My experience is basically that they have clear product definitions and clear APIs, but they also have some products that did not attract many users and had to fade out of the market. Compared with AWS, Google Cloud is more technically oriented, while AWS is better at serving commercial companies.

Here is a summary for some basic things to use google cloud:

  1. First create an account and do some basic setup.
  2. At first you get some free volume, for some time, for most APIs.
  3. Create a project in the account and download the project's private key as a JSON file; you will use this key later.
  4. Then download the Google Cloud SDK, which gives you the gcloud command to do a lot of the work.
  5. Many Google API features can be tested through gcloud on the command line. Normally you get a token through OAuth2 and use this token to run the API commands; the token expires after some time.
    So from the command line you can do immediate API tests.
  6. If you use an API client package (Java, Python, JS, ...), you need to set an environment variable pointing at your project key so the client package can run, e.g.:
    export GOOGLE_APPLICATION_CREDENTIALS="/opt/googleSDK/myProject-ffe45577ca.json"
    Then the client can run OK. Please refer to each API's docs for how to use it.
  7. When your free trial period ends, you just upgrade your account to a GCP billing account with proper billing information, and then charging starts. After you activate the upgrade, you do not need to update your project key JSON; the cloud side automatically knows your project is validated.
  8. In the management console you can see API calling activity: calls per day, per minute, etc. The console looks very complex, but in fact there is not too much to manage there; it is easy to use.
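Steps 3–6 above can be sketched as one script (the key file path is hypothetical, and the gcloud calls are guarded so the sketch is a no-op on machines without the SDK):

```shell
# Hypothetical path to the project key downloaded in step 3.
KEY_FILE="/opt/googleSDK/myProject-key.json"

# Step 6: client libraries read credentials from this variable.
export GOOGLE_APPLICATION_CREDENTIALS="$KEY_FILE"

# Step 5: get a short-lived OAuth2 token for command-line API tests.
if command -v gcloud >/dev/null 2>&1; then
  gcloud auth activate-service-account --key-file="$KEY_FILE"
  TOKEN="$(gcloud auth print-access-token)"   # expires after a while
  echo "Bearer $TOKEN"
fi
```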





Free API management choices for distributed Restful Web Service

If you are looking for a cheaper solution for web service API management, this list may give you some hints.

As soon as you have distributed or micro web services, you get involved in putting all the shared features – such as security, authentication, and auditing – into one place for every service. Here I list some free or low-cost solutions for API lifecycle management frameworks and products. If you use Amazon AWS or Microsoft Azure, they already supply similar services. For small companies that do not want to be bundled to a cloud provider, an open source, self-controlled API management solution gives more flexibility and independence for future migration.

Basically they offer either an on-premise or an as-a-service model. On-premise means deployment in a physical or virtual data center of your own; the as-a-service cloud version lets you integrate with cloud-based API management and manage it from anywhere.

  1. WSO2 on-premises – WSO2 API Manager is a 100% open source enterprise-class solution that supports API publishing, lifecycle management, application development, access control, rate limiting, and analytics in one cleanly integrated system. Runs on Java, most databases, and Apache ActiveMQ. Apache 2.0 license.
  2. Kong Community Edition – open source with a large customer base. Kong's server, based on the widely adopted NGINX HTTP server, is a reverse proxy processing your clients' requests to your upstream services. Kong's datastore holds the configuration to let you horizontally scale Kong nodes; Apache Cassandra and PostgreSQL can be used.
  3. Tyk On-Premises – not open source, but single-gateway Tyk Pro licences are free forever: install your preferred package free for a single gateway, and it stays affordable at scale. No forks to maintain, no third-party dependencies, no SaaS add-ons to purchase, no restrictions on the number of users, APIs, or analytics retention, and no features held back. The entire platform runs under your control on your own servers, priced by the number of gateways in your environment, not the size of your team.
  4. Apiman (no longer getting new features after Red Hat acquired 3scale) – simple to use; installs on WildFly (with Elasticsearch) or Tomcat 8.
  5. API Umbrella – looks easy to use and install; requires MongoDB and Elasticsearch.
  6. – a very new API manager with three parts: gateway, API management, and portal; requires MongoDB to store mandatory data and Elasticsearch (Apache License v2).
  7. Apigee – acquired by Google; not free.
  8. Apiary free edition – now with Oracle. API Blueprint is simple and accessible to everybody involved in the API lifecycle; its syntax is concise yet expressive, and it is completely open-sourced under the MIT license. Its future is transparent and open: API Blueprint doesn't need a closed working group; instead it uses an RFC process similar to the Rust language or Django Enhancement Proposal processes.

How to expand disk size of CentOS 7 for VSphere virtual machines

Here I list the major steps to extend the hard disk (adding an sda3 partition) from 40GB to 100GB on my CentOS 7 VM. I referred to some internet links to make it work, but there are also a few tricky parts you need to watch out for.

  1. Show current disk partition info and sizes:

df -h

Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 38G 34G 3.8G 90% /
devtmpfs 2.0G 0 2.0G 0% /dev
tmpfs 2.0G 80K 2.0G 1% /dev/shm
tmpfs 2.0G 8.9M 2.0G 1% /run
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/sda1 497M 246M 252M 50% /boot


lsblk

sda 8:0 0 40G 0 disk 
├─sda1 8:1 0 500M 0 part /boot
└─sda2 8:2 0 39.5G 0 part 
 ├─centos-swap 253:0 0 2G 0 lvm [SWAP]
 └─centos-root 253:1 0 37.5G 0 lvm /
sr0 11:0 1 1024M 0 rom

sudo fdisk -l

Disk /dev/sda: 42.9 GB, 42949672960 bytes, 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x0001af8a

Device Boot Start End Blocks Id System
/dev/sda1 * 2048 1026047 512000 83 Linux
/dev/sda2 1026048 83886079 41430016 8e Linux LVM

Disk /dev/mapper/centos-swap: 2164 MB, 2164260864 bytes, 4227072 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/centos-root: 40.3 GB, 40256929792 bytes, 78626816 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


hostnamectl

 Static hostname: wayne-v-ctos7
 Icon name: computer-vm
 Chassis: vm
 Machine ID: da93c1884d894932aef5bd13121e7478
 Boot ID: 772e152a486c45e58feb119b02a4c5f7
 Operating System: CentOS Linux 7 (Core)
 CPE OS Name: cpe:/o:centos:centos:7
 Kernel: Linux 3.10.0-229.20.1.el7.x86_64
 Architecture: x86_64

shutdown -h now

2. Then shut down the VM, back up (export) your VM, and delete all the snapshots of the VM; then use vSphere Client 5.5 to extend the disk size from 40GB to 100GB. If you do not remove the snapshots, you cannot extend the disk size!

3. Then start the VM and create a new partition to use the new disk blocks:

sudo fdisk -l
sudo fdisk /dev/sda

The goal in fdisk is to create a new primary partition 3 (sda3) and set its type to Linux LVM (8e):
n - create a new primary partition 3 using the remaining disk blocks
3 - choose partition number 3 and accept the default start/end sectors
t - change partition 3 to type 8e
w - save and quit

You need to understand why I use 8e (Linux LVM) here and what each of these options does; use "m" for help inside fdisk.
4. After rebooting the VM, I now have sda3, but it is not used yet. I need to use it to extend the volume group:

df -h
sudo fdisk -l
sudo vgdisplay
sudo vgextend centos /dev/sda3
df -h
sudo vgdisplay
sudo lvextend -L+59999M /dev/centos/root
sudo resize2fs /dev/centos/root    **** for CentOS 5/6 (ext3/ext4)
sudo xfs_growfs /dev/centos/root   **** for CentOS 7 (XFS)
df -h

Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 97G 35G 62G 36% /
devtmpfs 2.9G 0 2.9G 0% /dev
tmpfs 2.9G 92K 2.9G 1% /dev/shm
tmpfs 2.9G 8.8M 2.9G 1% /run
tmpfs 2.9G 0 2.9G 0% /sys/fs/cgroup
/dev/sda1 497M 246M 252M 50% /boot


*** About this command:

sudo lvextend -L+59999M /dev/centos/root

– here I have 60GB of new space but I only use 59999M, because I need to leave a little space so the system can still run; do not use all the space.
And how is /dev/centos/root derived? "centos" is the "VG Name" in the vgdisplay output, and "root" is the logical volume; if you look at "df -h", you will see it in this form:
/dev/mapper/centos-root  (VolumeGroup-Volume)
So do not make any mistake here.

For what each step means and the details, you can refer to these two links:






How to extract the diff of two text files in Linux

Q: You have two files, file1.txt and file2.txt. What you want is the lines that are in file1 but not in file2. How do you do this with a Linux command?

A: If your files are already sorted, there is a command in Linux with exactly this ability:


 comm - compare two sorted files line by line

 comm [OPTION]... FILE1 FILE2


-1 suppress lines unique to FILE1

-2 suppress lines unique to FILE2

-3 suppress lines that appear in both files

So this command will solve your issue:

comm -2 -3 file1.txt file2.txt > fileResult.txt
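A quick sketch of that comm call with two made-up sorted files:

```shell
# Two small sorted sample files (contents invented for the demo).
printf 'apple\nbanana\ncherry\n' > file1.txt
printf 'banana\ndate\n'          > file2.txt

# -2 drops lines unique to file2, -3 drops common lines,
# leaving only the lines unique to file1.
comm -2 -3 file1.txt file2.txt   # prints "apple" then "cherry"
```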


But what if your files are not sorted?

There is another way to solve this too, assuming all your lines are shorter than 400 characters:

diff -a --width=400 --suppress-common-lines -y file1.txt file2.txt > fileResult.txt


The result you get from this diff command is not clean, in fact: it pads each line with "…..(spaces)….<", so you still need to find and remove these extras.

If you know a better diff-based way, please let me know. Thanks.
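One alternative that avoids sorting entirely (assuming exact whole-line matches are what you want) is grep; the file contents below are made up for the demo:

```shell
# Same idea, this time with unsorted files.
printf 'cherry\napple\nbanana\n' > file1.txt
printf 'date\nbanana\n'          > file2.txt

# -F fixed strings, -x whole-line match, -v invert, -f patterns from file:
# keeps the lines of file1.txt that appear nowhere in file2.txt.
grep -F -x -v -f file2.txt file1.txt > fileResult.txt
cat fileResult.txt   # prints "cherry" then "apple"
```

Unlike the diff approach, the output needs no post-processing, and the input order of file1 is preserved.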

About MySQL index max length


To speed up searching and sorting, an index is a very important tool you must use (INDEX, PRIMARY KEY, and UNIQUE). By default (in version 5.6), InnoDB supports 767 bytes (<256 UTF8 chars) for an index key. But if your text is a bit longer and you still need to index it, what should you do?

You have to understand the two file format concepts of Antelope (the default) and Barracuda (available starting from MySQL 5.6). Refer to the MySQL manual:

  • Antelope is the original InnoDB file format, which previously did not have a name. It supports COMPACT and REDUNDANT row formats for InnoDB tables and is the default file format in MySQL 5.6 to ensure maximum compatibility with earlier MySQL versions that do not support the Barracuda file format.

So you can see that the new file format supports up to 3072 bytes for an index, which improves search speed a lot for long text. Here are the steps to make the large index work:
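The byte limits map to character counts like this (a quick sanity check; MySQL's legacy utf8 uses up to 3 bytes per character, utf8mb4 up to 4):

```shell
# 767-byte limit / 3 bytes per utf8 char -> "<256 UTF8 chars"
echo $((767 / 3))
# 3072-byte Barracuda limit / 4 bytes per utf8mb4 char
echo $((3072 / 4))
```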

You can change the config in my.cnf (/etc/mysql/my.cnf), or you can change it dynamically at the console:

SET GLOBAL innodb_file_format=Barracuda;
SET GLOBAL innodb_file_per_table=ON;     -- default is ON already
SET GLOBAL innodb_large_prefix=1;
All of these settings are captured when a table is created. (You can change existing tables using ALTER TABLE.)

About ROW_FORMAT=DYNAMIC (or COMPRESSED, which saves space but costs CPU):
ROW_FORMAT=DYNAMIC can be appended to the end of CREATE TABLE or used in ALTER TABLE. MySQL Workbench also supports this in the alter-table "Options" tab.
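As a sketch (table and column names are hypothetical), a long-prefix index created under those settings could look like:

```shell
# body(768) is a 768-character index prefix, which with utf8mb4
# uses the full 3072-byte Barracuda allowance.
cat > large-index-example.sql <<'EOF'
CREATE TABLE notes (
  id   BIGINT PRIMARY KEY AUTO_INCREMENT,
  body VARCHAR(1024),
  INDEX idx_body (body(768))
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;
EOF
```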

How to make the Linux guest screen display properly in Hyper-V


I have worked with Linux guests on Hyper-V recently; getting a proper screen display set up is more troublesome than in VirtualBox. (Running Linux on Hyper-V is not a good idea anyway, as it obviously runs slowly, even though MS ships integration support in most Linux kernels.)

Here is the notes for the CentOS and Ubuntu:

For CentOS and Red Hat


grubby --update-kernel=ALL --args="video=hyperv_fb:1024x768"


For Ubuntu

Install linux-image-extras (hyperv-drivers):

sudo apt-get install linux-image-extra-virtual

Open the Terminal and type:

sudo gedit /etc/default/grub

Find the line starting with GRUB_CMDLINE_LINUX_DEFAULT, and add video=hyperv_fb:1920x1080 or your preferred resolution in between the quotes (the maximum possible resolution is 1920×1080), like this:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash video=hyperv_fb:1920x1080"

Save and Exit.


sudo update-grub

Restart Hyper-V (restarting the Ubuntu guest alone might be enough).


Improve the Mybatis performance


Some tips on improving MyBatis performance.

  1. Set fetchSize to a considerable amount.
  2. Only retrieve the needed columns from the DB and map them to the Java object.
  3. Try to reduce nested selects within one user interaction; prefer nested joins at the SQL level in a single DB call.
  4. Use cache integration, such as Memcached or Ehcache, to improve re-read speed.
  5. Optimize the SQL and the DB itself, such as indexes for sorting and ordering.
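Tips 1 and 2 can be sketched in a hypothetical mapper fragment (the names org.acme.Foo, selectFoos, and the foo table are made up):

```shell
# fetchSize hints the JDBC driver to pull rows in bigger batches,
# and the SELECT names only the columns the Java object needs.
cat > FooMapper-snippet.xml <<'EOF'
<select id="selectFoos" resultType="org.acme.Foo" fetchSize="500">
  SELECT id, name FROM foo
</select>
EOF
```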



How to run Memcached on Mybatis


1. Install Memcached

To start, install memcached via apt-get, for example on Ubuntu 12.04:

sudo apt-get install memcached

It starts memcached automatically:

ps -ax | grep memcac
21199 ?        Sl     0:00 /usr/bin/memcached -m 64 -p 11211 -u memcache -l

Refer to the official MyBatis link:
Add the mybatis-memcached jar to your Maven dependencies,

and then add the Memcached cache to whichever MyBatis mapper you want to use it in:
<mapper namespace="org.acme.FooMapper">
  <cache type="org.mybatis.caches.memcached.MemcachedCache" />
  ...
</mapper>
Create a memcached.properties file and put it on your class path, e.g. the resources folder:
# space separated list of ${host}:${port}
org.mybatis.caches.memcached.servers = localhost:11211
# the expiration time (in seconds)
org.mybatis.caches.memcached.expiration = 600
org.mybatis.caches.memcached.asyncget = true
org.mybatis.caches.memcached.timeout = 600
org.mybatis.caches.memcached.timeoutunit = java.util.concurrent.TimeUnit.SECONDS
# if true, objects will be GZIP compressed before putting them to Memcached
org.mybatis.caches.memcached.compression = false

In fact, if you list multiple cache servers in org.mybatis.caches.memcached.servers, you get failover ability among them: it automatically continues using a live server if one dies.

This integration is in fact based on spymemcached, an asynchronous, single-threaded Memcached client. When you call any caching-related method on spymemcached's MemcachedClient, it is handled asynchronously: the called method writes the details of the operation into a queue and returns control to the caller, while the actual interaction with the Memcached server is handled by a separate thread running in the background.

My testing shows that it can cut DB data loading time by at least 50% when the data is already in Memcached. This also shows that MyBatis's built-in cache is not big enough, because it is not designed purely for caching.


Puppet on Windows common issues – 3

By W.ZH Sept 2013

C:\Windows\System32 access issue for puppet

This is a very tricky issue and I must alert you to it. Puppet is written in 32-bit Ruby, so when a test or program on your agent system accesses the C:\Windows\System32 folder, you can get some strange errors. In my case, I wanted to enable the Telnet client on Windows; once it is enabled, the file C:\Windows\System32\telnet.exe shows up, so I wanted to use "creates" to find it and avoid a duplicate install. But it always failed to find it, because on my 64-bit Windows, Ruby is redirected to C:\Windows\SysWOW64 to look for it.

exec { 'enable_telnetclient':
  command  => "C:\\Windows\\System32\\pkgmgr.exe /iu:TelnetClient /quiet /norestart",
  provider => windows,
  creates  => "C:\\Windows\\System32\\telnet.exe",
  returns  => [0,194],
}

So what is the proper way to let a 32-bit program access the System32 folder on 64-bit Windows? Use SysNative.

The proper code should be:

exec { 'enable_telnetclient':
  command  => "C:\\Windows\\SysNative\\pkgmgr.exe /iu:TelnetClient /quiet /norestart",
  provider => windows,
  creates  => "C:\\Windows\\SysNative\\telnet.exe",
  returns  => [0,194],
}

Why? Please read these links; I do not need to copy-paste them here: the Total Commander wiki article "Some files and folders shown by Windows Explorer are not shown by Total Commander" (section "Affected files and folders"), and the puppet-users group topic A_9i2Wx1yFI.