Wednesday, October 5, 2016

Modifying Chef resources after they're already in the resource collection


First, let's fire up chef-shell and create a basic resource to demonstrate:

$ chef-shell  
chef (12.14.57)> recipe_mode
chef:recipe (12.14.57)> file 'testing_edit' do
chef:recipe > content 'words'
chef:recipe ?> end
 => <file[testing_edit] @name: "testing_edit" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :create, :delete, :touch, :create_if_missing] @action: [:create] @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: nil @default_guard_interpreter: :default @elapsed_time: 0 @sensitive: false @declared_type: :file @cookbook_name: nil @recipe_name: nil @content: "words">

The easy way to modify the resource collection

Now, I am going to modify this resource using the new edit_resource DSL method:
chef:recipe (12.14.57)>
chef:recipe >
chef:recipe > edit_resource(:file, 'testing_edit') do
chef:recipe > content 'different words'
chef:recipe ?> end
 => <file[testing_edit] @name: "testing_edit" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :create, :delete, :touch, :create_if_missing] @action: [:create] @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: nil @default_guard_interpreter: :default @elapsed_time: 0 @sensitive: false @declared_type: :file @cookbook_name: nil @recipe_name: nil @content: "different words">
chef:recipe (12.14.57)>

Coolness (effectively removing a resource from the collection by setting its action to :nothing):

chef:recipe (12.14.57)> edit_resource(:file, 'testing') do
chef:recipe > action :nothing
chef:recipe ?> end
 => <file[testing] @name: "testing" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :create, :delete, :touch, :create_if_missing] @action: [:nothing] @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: nil @default_guard_interpreter: :default @elapsed_time: 0 @sensitive: false @declared_type: :file @cookbook_name: nil @recipe_name: nil @content: "words">
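The same trick works outside of chef-shell, in a wrapper recipe. A minimal sketch (assuming Chef 12.10+, where edit_resource and its sibling delete_resource landed in core; the cookbook and file names are made up):

  include_recipe 'community_cookbook::default'

  # change the content the upstream recipe set
  edit_resource(:file, 'testing_edit') do
    content 'words from the wrapper'
  end

  # or truly remove it from the resource collection
  delete_resource(:file, 'testing_edit')

Note that editing the action to :nothing (as above) only neuters a resource; delete_resource actually removes it.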

Tuesday, October 4, 2016

Setting up Chef Automate / Workflow (aka: Delivery) in a completely air-gapped environment - level 1

Manual Delivery install in an air-gapped env (AWS in Oregon)

Creation of Air-gapped environment

Create DHCP option set

Create vpc 'alexv-manual automate in airgapped env'
  set DNS resolution to Yes
  set DNS Hostname to Yes
  set DHCP option set to one above

Create a Windows 'jump box' inside VPC
  network: vpc above
  subnet - create new
    VPC - vpc above
    AZ - no pref
    CIDR - same as vpc
  refresh vpc field and select subnet
  assign public IP - true
  Network - default
  Storage - default
   (make sure you have enough free space to store all of the binaries needed inside VPC)
   (I used 40 gigs) - this may mean you have to expand the default partition to occupy the full disk
    Name - alexv-jump box
    create new SG 'jump box'
    RDP - anywhere
    HTTP - anywhere
    HTTPS - anywhere
  Select your keypair
  ** On your local Mac, inside the RDP app, enable folder redirection when you add this box.
     set folder redirect to the location where your delivery.license file lives
  on the windows box, install filezilla - to make it easy to transfer files

Installing Delivery

Create Chef Server
  VPC - same as above
  auto assign public ip - false
  storage - change to 30
  tag - alexv-chef-server
    create new SG "chef server"
    open port 22
    open  All ICMP
  Keypair - select yours

Create Workflow server
  click on chef-server, select more like this
  VPC - same as above
  Subnet - internal subnet
  auto assign public ip - false
  storage - change to 30
  tag: alexv-Workflow-server
    create new SG "Workflow server"
    open  port 22
    open  All ICMP
    (maybe needed?) 9200 - due to Elasticsearch GET errors
    (maybe needed?) 5672 - RabbitMQ port; due to another failure during setup
  Keypair - select yours

Create Windows (or *nux) build node
  network: vpc above
    VPC - vpc above
    AZ - no pref
    CIDR - same as vpc
  refresh vpc field and select subnet
  assign public IP - false
  Network - default
  Storage - default
    Name - alexv-windows build node
    create new "windos build node"
    open RDP - anywhere
    open All ICMP
    open 5985-5986 anywhere (for WinRM - RDP is already opened above)
  Select your keypair

  Internet Gateway:
    create internet gateway - alexv-air gapped
    attach to VPC (above)

    when you created the VPC, it also created a route table
      add -> point at internet gateway

Create 4 CentOS boxes to be environment nodes
  medium size
  HDD default
  SG - copy from workflow server
  Name SG "environment nodes"

Create 2 CentOS boxes to be build nodes
  medium size
  HDD - 15 gigs
  SG - copy from workflow server
  Name SG "build nodes"

Actually Install and Configure Automate

on the Chef Server and Automate node - follow directions
disable ipv6 in /etc/hosts
make sure they can ping each other
make sure they can resolve dns of each other
make sure they can't access the internet

Jump Box (or workstation)
copy target OS binaries onto the jump box: Chef server, Automate, Push Jobs server, ChefDK, Chef Manage, Supermarket if needed.
copy binaries to correct server /tmp folder
copy chefdk for use on workstation as a management node
setup user ssh auth
  ssh-keygen -t rsa -b 4096 -C ""

Chef Server
install chef server per directions
chef-server-ctl user-create alex Alex Vinyar alex@example.com 'alexalex' --filename /tmp/alex_user.pem
  (user-create takes: username, first name, last name, email, password; the email is a placeholder)
chef-server-ctl org-create alex_org 'Fourth Coffee, Inc.' --association_user alex --filename /tmp/alex_org-validator.pem

install push jobs per directions:
  sudo chef-server-ctl install opscode-push-jobs-server --path /tmp/opscode-push-jobs-server.x86_64.rpm

sudo chef-server-ctl user-create delivery Delivery User delivery@example.com 'alexalex' --filename /tmp/delivery_user_key.pem
  (email is a placeholder)
sudo chef-server-ctl org-create automate_org 'org description'  --filename /tmp/automate_org-validator.pem -a delivery

Install manage: (optional)
sudo chef-server-ctl install chef-manage --path /tmp/chef-manage-2.4.3-1.el6.x86_64.rpm
reconfigure chef, push, manage

on the Delivery server
install delivery
setup command: sudo delivery-ctl setup \
                      --license /tmp/automate.license \
                      --fqdn ip-10-0-0-67.ec2.internal \
                      --key /tmp/chefserver/delivery_user_key.pem \
                      --server-url https://ip-10-0-0-80.ec2.internal/organizations/automate_org
copy all PEMs from chef server to delivery (validator, admin, delivery_user)
Enter name of your enterprise
  example: alex_ent
  (note: look for a bug here where enterprise is created, but admin creds are not displayed nor created in /etc/delivery/<enterprise-admin-credentials>)
  (if bugged) create the enterprise manually
    delivery-ctl create-enterprise alex_ent --ssh-pub-key-file=/etc/delivery/
Copy ChefDk binary to /tmp/chefdk-0.18.30-1.el6.x86_64.rpm
install build node
  sudo delivery-ctl install-build-node -I /tmp/chefdk-0.18.30-1.el6.x86_64.rpm -f -u chef -P chef

Verify build node works with `knife node status`
  this will query push jobs server for status of each node (different from knife status)
  available means push jobs can communicate with the node (you will know that at least push jobs is running at this point)
Verify you can fire off a push job:
  knife job start chef-client --search '*:*'

create user (via UI or CLI)

add the public SSH key from the workstation's `ssh-keygen` step to the user
  delivery ui -> user -> ssh pub key

Jump Box (or workstation)
Install chefdk
configure knife.rb with delivery key for communication with chef server
  node_name            'delivery'
  chef_server_url       "https://ip-10-0-0-80.ec2.internal/organizations/automate_org"
  client_key           'C:\Users\chef\.chef\delivery.pem'
  trusted_certs_dir    'C:\Users\chef\.chef\trusted_certs'
  # analytics_server_url 'https://cad-chef-server/organizations/cad'
  cookbook_path 'C:\Users\chef\chef-demo\cookbooks'

fetch certs if needed
  knife ssl fetch

verify knife works
 knife node list
 (or from delivery server)
  knife node list -k /etc/delivery/delivery.pem -u delivery --server-url https://ip-10-0-0-80.ec2.internal/organizations/automate_org

Pull down all of the cookbook dependencies to be used in the air-gapped env (I do it via Berks)
  mkdir repo
  cd repo
  chef generate cookbook staging (this will be the first test cookbook)
  modify metadata.rb of the staging cookbook to include:
    depends 'delivery-truck'
    depends 'push-jobs'
    depends 'build_cookbook'
    depends 'delivery_build'
  mkdir seeding
  cd staging\.delivery\build_cookbook
  run `berks vendor ..\..\..\seeding` to pull down all dependencies into a local folder

upload necessary cookbooks up to chef server
  knife cookbook upload -o seeding -a
  (or alternatively `knife cookbook upload delivery-truck --include-dep -o seeding`)

test ssh auth to delivery box
  ssh -l alex@alex_ent -p 8989 ip-10-0-0-67.ec2.internal

Configure delivery cmd - C:\Users\chef\cookbooks\staging\.delivery\cli.toml
  in root of staging cookbook$ delivery setup -e alex_ent -o automate_org -s ip-10-0-0-67.ec2.internal -u alex
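The resulting cli.toml should look roughly like this (a sketch from memory - verify against the file delivery setup actually generates):

  api_protocol = "https"
  enterprise = "alex_ent"
  git_port = "8989"
  organization = "automate_org"
  server = "ip-10-0-0-67.ec2.internal"
  user = "alex"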

make sure you can interact with delivery via delivery cli:
  Verify API works
    delivery api get users
    delivery api get orgs
  verify you can create a project
    create a cookbook
    `delivery init` inside that cookbook

First pipeline
I'll use the staging cookbook as it's a nice example
initialize delivery pipeline
  inside staging cookbook run `delivery init`
bump metadata.rb if needed
modify config.json to exclude the spec and test folders; otherwise Foodcritic lints them and the workflow fails epically at the lint phase.
  $ cat config.json
        {
          "version": "2",
          "build_cookbook": {
            "name": "build_cookbook",
            "path": ".delivery/build_cookbook"
          },
          "lint": {
            "foodcritic": {
              "excludes": ["spec", "test"]
            }
          },
          "skip_phases": [],
          "build_nodes": {},
          "dependencies": []
        }
change Berksfile (of build cookbook)
  Since you're not connected to internetz, you'll fail all phases of workflow due to Berksfile
  change source to :chef_server
    $ cat Berksfile
      source :chef_server
      # or your internal supermarket

      group :delivery do
        cookbook 'delivery_build'#, chef_api: :config
        cookbook 'delivery-base'#, chef_api: :config
        cookbook 'test', path: './test/fixtures/cookbooks/test'
      end

      # group :delivery do
      #   cookbook 'delivery_build', git: ''
      #   cookbook 'delivery-base', git: ''
      #   cookbook 'test', path: './test/fixtures/cookbooks/test'
      # end

add and commit changes
  git add -u
  git commit -m 'very descriptive comment'
  delivery review

Bill of Materials:
Filezilla (windows) - management node
  note: seems like ChefDK 0.17.17 doesn't work in an isolated environment (fails with a Yajl error)
ChefDK 0.18.30 for Windows
chef manage rpm
supermarket rpm
berks vendor of `build cookbook`
  should include all of the following:


*) The setup command *may* create an enterprise for you. If you see that behavior and do not get credentials as output, you will have to delete the enterprise and create it again using the create-enterprise command.

*) The build node install command installs push jobs via this script:
…which is called by this script:

Thursday, August 11, 2016

Getting started with DevOps - Basic Chef (and any other CI/CD environment)

I got a question a few weeks back, and I think the answer is worth sharing.

  • Do we need to have a chef workstation hosted in our cloud environment that everyone logs in to (something like a jump box configured with chefdk and all the plugins), or can users spin up VMs off their own machine and that is used as the Chef Workstation? 
  • If spinning up VMs off our machine, how do we connect to chef-repo which I’m setting up in AWS? 
  • We need to connect to git for source control. I have set up an enterprise git instance - how do I change the cookbooks to connect to our instance of git?

This is going to be a lot of words, because there is no easy answer...
What one starting a similar journey should do though, is take the below suggestions, and run through them iteratively. Version 1, ver 2, 3...etc. Don't try to do everything at once.

Also, is a fantastic tool to figure out where you are in the DevOps journey, and where others typically go.


Source control will be your base, so it's first in the list.

For git, there are a lot of articles about "forking the code" and the eventual price of having done that. So, when it comes to community cookbooks, the best course of action is - don't fork community cookbooks (or at the very least, don't fork them for very long).

If you rely on a community cookbook, take it, along with the full git history, and upload it into your private git and your chef server. Any changes to community cookbooks should be done via `wrapper cookbook` (mycompany_apache for apache, etc..).

If some feature is not supported (or you found a bug), make a change and ASAP push the change back to the community via PR - this is so you don't have to maintain the fork, and can take advantage of improvements in the public version. (example: you're using apache 6 with your private hotfix, apache 7 comes out, and it's drastically different. Due to your custom changes, you need to spend 20 hours merging the versions and applying any custom hotfixes you've accumulated. You further fork the code to make it work. You spend all your days fixing bugs. Your head hurts from drinking too much coffee....)

Chef repo:

So, a good pattern is to have a git repo called chef_base or mycompany_base, etc..
Usually, a user starts by cloning this repo locally. Any updates that would affect the whole company get pushed back to git, so every user can benefit from them.

In the chef-base you'll have your environments folder with various environments, roles with roles, and a folder called cookbooks (execute `chef generate repo test` for a basic example). You would have chefignore and .gitignore filled out as per your org rules. You would then do something about the .chef folder - either have a symlink that points to a known location, use a known variable to load the file, or leave it up to the user to fill in the details. Typically the .kitchen or Vagrant file to stand up a cloud workstation would live here (more on this later). The \cookbooks folder is either empty or has a global cookbook like chef_workstation in there.

So, when a user starts working with chef, they clone chef_base and have everything they need to get started. All they do, is go into \cookbooks folder, and git clone the thing they want to work on.  This keeps chef-repo and each cookbook they work on completely independent. If they want to add new community cookbooks to your org, they follow the same process as above: clone community cookbook into /cookbooks and push it internally.

Chef workstation:

So, you definitely want each user to have their own cloud workstation. Also, they should have the ability to create/destroy them whenever they need to. On average, workstations don't survive longer than a week (nor should they).

Locally is pretty easy. Use test kitchen, or vagrant, mount the local chef-repo folder inside the VM and you're done.
(Here is how kitchen would work:
(here is how vagrant would work:

With AWS/Azure/VMware/etc., mounting a folder is done differently. In this scenario, when users run Test Kitchen, it would create additional VMs in the cloud. You'll need to set up a sync mechanism if your virtualization platform doesn't support mounting a local folder on a VM. You could give users an EBS volume they could share across the workstation and local dev machine. Or just a regular network share they can mount locally and on a workstation.  Also, I heard works really well, however I never touched it personally.

Key takeaway here is that lots of companies are going down the VM road.

The long version is that this decision will be guided by a couple of factors - how powerful your users' workstations are, your company's business direction, and how much money your DevOps initiative has been given.

What I've seen in the wild is very interesting. The best and the worst shops use nearly identical workstation hardware. On one end of the bell curve, there are companies where employees have 2 gig laptops incapable of opening notepad in under 30 seconds. All work is done locally, and is very painful. On the other end, you also have 2 gig laptops - though typically Surfaces and MacBooks - however, in this case, all of the work is done in the cloud, and these machines are plenty powerful since they are simply used to RDP into remote resources (and used for Skype and Facebook the rest of the time).

Hope that helps.

Monday, April 25, 2016

Resetting opscode-reporting password

Once in a while an opscode-reporting upgrade goes wrong. Or it doesn't start, you do manual cleanup, and basically the passwords go out of sync.

The solution is pretty straightforward - reset the passwords to what the system thinks they should be.

Open up /etc/opscode-reporting/opscode-reporting-secrets.json
Grab opscode_reporting & opscode_reporting_ro passwords and pipe them to opscode-pgsql

echo "ALTER USER opscode_reporting PASSWORD 'XXXXX' " | su -l opscode-pgsql -c 'psql'
echo "ALTER USER opscode_reporting_ro PASSWORD 'XXXXX' " | su -l opscode-pgsql -c 'psql'

You should get the result "ALTER ROLE" from each of the 'echo' commands

Next, make sure rabbitmq password is in sync:
In the same .json file, in the "opscode_reporting" section, grab the "rabbitmq_password" and use it in place of XXXXX

PATH=/opt/opscode/embedded/bin:$PATH rabbitmqctl change_password runs XXXXX

then chef-server-ctl restart opscode-reporting

And finally, you might still be broken.
If you look at the process list and see an error similar to below, send the HUP to svlogd to reload the configs.

root      1456  0.0  0.0   4092   196 ?        Ss    2015   3:12 runsvdir -P /opt/opscode/service log: vlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?svlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?svlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?svlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?

So grab the correct pid by running chef-server-ctl status

run: opscode-reporting: (pid 17407) 30088s; run: log: (pid 32415) 88051s

kill -HUP 32415

Sunday, April 3, 2016

Chef - Passing output of one resource as input to the next

There are a couple of ways to do that.

One is via lazy:

directory '/tmp'

file '/tmp/alex.txt' do
  content 'sudo make me a sandwich'
end

ruby_block "something" do
  block do
    command = 'cat /tmp/alex.txt'
    command_out = shell_out(command)
    node.set['a'] = command_out.stdout
  end
  action :create
end

file '/tmp/alex2.txt' do
  action :create
  owner 'root'
  group 'root'
  mode '0644'
  content lazy { node['a'] }
end
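The other common way (a sketch along the same lines) is node.run_state, which keeps the value in memory for the duration of the run instead of persisting it as a node attribute:

ruby_block 'capture file contents' do
  block do
    node.run_state['a'] = shell_out('cat /tmp/alex.txt').stdout
  end
end

file '/tmp/alex3.txt' do
  content lazy { node.run_state['a'] }
end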

Thursday, March 31, 2016

PowerShell and Chef - how to view the script the powershell_script resource is executing

So, I was troubleshooting passing arrays to powershell_script resource.

Troubleshooting powershell_script

First - the easy way. Just run chef-client -l debug. In debug logging, you can see the whole script, which might be enough.

What makes troubleshooting powershell_script difficult, is the way it works from inside chef. A temporary file is created, and immediately nuked after execution, making it somewhat difficult to see exactly what happened.

After some messing around, I realized a simple trick:
powershell_script 'script_name' do
  code <<-EOH
    copy $MyInvocation.ScriptName c:/chef_powershell_staging_file.ps1
  EOH
end

Passing array node attributes to powershell_script:

Seems that in defining a generic array, ruby inserts square brackets [ ] which actually become part of the string when consumed by powershell_script, and powershell chokes on it.
default['test_cookbook']['array'] = 'string1','string2'
default['test_cookbook']['array'] = %w(string1,string2)
In both of the above, Powershell will either throw an error or generally not work as expected
Missing type name after '[' 
What actually happens is that during the resource declaration phase, the square brackets become part of the string (you can see it via chef-shell by creating a simple bash or powershell_script resource)

chef:attributes (12.8.1)> default['test_cookbook']['array'] = 'string1','string2'
 => ["string1", "string2"]
for example bash:
chef:recipe >bash 'some-bash' do
chef:recipe > code <<-EOH
chef:recipe"> $Array = #{node['test_cookbook']['array']}
chef:recipe"> EOH
chef:recipe ?> end
=> <bash[some-bash] .... @code: " $Array = [\"string1\", \"string2\"] \n" ...
using native ruby:
default['a']['b'] = %w(a,b,c)
keeping the recipe the same, the resulting code will be:
... @code: " $Array = [\"a,b,c\"] \n" ... 

Solution - simple in retrospect - double quotes:
node.default['a'] = "'value1', 'value2', 'value3'"
In your recipe, you'll get an actual powershell array:

powershell_script 'script_name' do
  code <<-EOH
    Out-File -InputObject "#{node['a']}".GetType() c:/out.txt
  EOH
end

Tuesday, February 16, 2016

Governments entering IT at glacial speeds.

What happens when you give powerful tools to people with low motivation?

They create products that make it attractive to go back to filling out stacks of paper forms by hand.

Problem of the day:
I was filling out a Visa application for entering Japan. On the plus side, the file is a PDF and I can actually type the information into it. All of the fields are present. That's another plus.

The minus is that input validation is broken on some fields. For example, of the 5 phone number entry fields, only 3 allow dashes. One of the date entry fields does not actually allow entry. Another date entry field only allows 3 digits for the year.

But that's actually not bad.
What's bad is that I could not print the thing. Something was literally not allowing me to print my own document.

This absolutely blew my mind. I attempted to print a few more times in complete disbelief at what was happening, before accepting the yak shave ahead of me as one of my own. First the obvious: was Ctrl+P different from File -> Print? Sadly, same result. (if you recall, Chrome takes over your print functionality)

So, can I just turn it off somewhere? Yes! Edit -> Preferences (it even has a hotkey!!)

And that is the story of how I printed a form that was likely butchered due to some government compliance rules on PDF security, by people who were not given any autonomy and probably no explanation.


Wednesday, November 25, 2015

INACCESSIBLE_BOOT_DEVICE - aka: how much I hate Intel Rapid RAID

One beer and two episodes of DBZ Super later, I am finally back to a functional Windows 10 machine. Also known as: wasting two hours beating my head against the useless recovery capabilities of Windows 10 and Intel's incompetence at writing drivers.

GA-EX58-DS4 with 4 drives in Raid 5.

Upgraded from Windows 7 to Windows 10, and because I am an idiot, upgraded the Intel Rapid RAID driver from 14.5 to 14.6.

After the reboot I got the :( INACCESSIBLE_BOOT_DEVICE. Instantly the memories of how much I hate upgrading Intel drivers rushed back into my head, and I remembered every time I swore never to upgrade their piece of %#@ drivers again...

Most of what I did I borrowed from

1: Boot into Windows 10 recovery mode. Advanced -> very Advanced -> Command prompt
2: Go to <system drive>\Windows\System32\drivers  (example: C:\Windows\System32\drivers)
3: Make a backup of existing driver files: ias*
3a: mkdir bad_intel
3b: copy ias*.*   bad_intel
4: get a piece of paper and a pen.
5: go to C:\Windows\System32\DriverStore\FileRepository and search for all folders with older version of a driver in it (dir /s iastora.sys)
6: write down a couple of characters to uniquely identify each folder. (example: iastorac.inf_amd64_47ebd65d436e75d0 - take the _47)
In my case I had 5 folders..
7: Start the recovery process:
7a: Take a look at the timestamps of iaStorA.sys in all of the FileRepository folders you found in the search.
7b. Copy the newest one over to C:\Windows\System32\drivers
7c. Exit command prompt (literally type exit)
7d. Click the button to continue booting to windows 10 and cross your fingers.
7e. If it doesn't work, go back to the start and repeat with the next file. (This is why you have paper - so you don't forget where you are in the process.)

This process worked for me on the 3rd file, which was from a few months ago.

Good luck, and if this works, have a beer in the honor of

useful links:


Wednesday, October 21, 2015

Provisioning a Windows box with Chef-provisioning on Azure from a Mac

After spending about half a day trying to get vagrant-azure to work, it became very clear that, as of this writing, the driver is just not mature enough. It works pretty well for Ubuntu/Linux, but the moment you try to provision Windows boxes, it sets your laptop on fire.

Instead of wasting any more time on it, I decided to give v1 and v2 provisioning drivers a chance, followed by Test Kitchen. IIRC they all use different drivers, and while all are pretty solid at provisioning Linux boxes, support for WinRM is very spotty.


The first challenge is to authenticate successfully via the provisioning driver. While Vagrant accepts a subscription id and path to a .pem as parameters, provisioning needs an azureProfile.json.

To get that file generated, I installed azure-cli via brew `brew cask install azure-cli`

Next, import azure creds with `azure account import ../../Projects/Azure/myazure.publishsettings`
This command will generate the missing azureProfile.json in ~/.azure

Next, validate it works with `azure account list`

Chef-Provisioning piece:

Get the name of the box (image) you'll be using: `azure vm image list | grep -i Win2012`

Next, hack up the simplest recipe that'll spin up a box:

`knife cookbook create azure_old`
content of recipes/default.rb:

require 'chef/provisioning/azure_driver'
with_driver 'azure'

machine_options = {
  :bootstrap_options => {
    :cloud_service_name => 'alexvinyar', #required
    :storage_account_name => 'alexvinyar', #required
    :vm_size => "Standard_D1", #required
    :location => 'West US', #required
    :tcp_endpoints => '80:80' #optional
  },
  :image_id => 'b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-14_04_2-LTS-amd64-server-20150706-en-us-30GB', #required
  # :image_id => '', #next step
  # Until SSH keys are supported (soon)
  :password => 'Not**RealPass' #required
}

machine 'toad' do
  machine_options machine_options
  ohai_hints 'azure' => { 'a22' => 'b33' }
end
Finally, run chef-zero (chef client in local mode): `chef-client -z -r azure_old`

If the above recipe fails, don't despair. Check the output and see if it gets past the authentication piece. If it does, it's just a matter of getting the chef-provisioning syntax correct.

Once the run finishes (Azure is slow), connect to the box with `ssh root@<ip>` for CentOS or `ssh ubuntu@<ip>` for Ubuntu boxes.

Now the Windows piece

With the `azure vm image list | grep -i Win2012` command I got a list of boxes, and once the test run with ubuntu succeeds, I move on to Windows.

This is where I took a break and had a beer. But I published this post anyway because I'll finish it eventually.

Useful links:

chef-base repo and workstation cookbook

A "chef-base" or "chef-repo" is a git repository which maps 1:1 to Chef organization hosted on the Chef server.  An organization in Chef server 12 is analogous to a single Chef server. Each of these "chef-base" Git repositories becomes the system of record for the global Chef objects (Environments, Roles, Data Bags) in a given organization.  This Git repository typically* does not contain cookbooks.

To set up chef-base, a user should first create an empty git repository on VSO / GitHub / GitLab / etc.
It makes things slightly easier if none of the files are initialized, including readme and gitignore.

Next, the user should execute the "chef generate repo <name of github repo>" command. This will generate the skeleton for the repo.
The resulting skeleton folder should be pushed in its entirety to the git repo.

Workstation cookbook

* One exception to not having cookbooks in chef-base is the workstation cookbook. 
The workstation cookbook is a shared cookbook for anyone using chef in an organization and provides a standardized way to work with chef. It also allows rapid on-boarding of new team members and the ability to safely experiment with new tools.
It works well in Vagrant, but there is a major limitation: you can't run Test Kitchen inside a Vagrant VM. For best results, encourage teams to leverage an internal or external cloud VM, where kitchen runs will create additional VMs in the same cloud.
A Vagrantfile can be placed in the root of the cookbook. This Vagrantfile has a few purposes (a sketch follows the list):
  • responsible for creating / destroying the workstation VM
  • kicking off chef-client run
  • easy access into the box via vagrant login
  • mounting the local chef-base as a folder in a VM
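A minimal sketch of such a Vagrantfile (the box name, paths, and chef_zero wiring are assumptions - adjust to your org):

Vagrant.configure('2') do |config|
  config.vm.box = 'bento/centos-7.1'
  # mount the local chef-base clone inside the VM
  config.vm.synced_folder '../..', '/home/vagrant/chef-base'
  # converge the box with the workstation cookbook
  config.vm.provision :chef_zero do |chef|
    chef.cookbooks_path = '../../cookbooks'
    chef.nodes_path     = '.vagrant/nodes' # chef_zero requires a nodes path
    chef.add_recipe 'workstation'
  end
end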
.gitignore should be modified to exclude all cookbooks with the exception of the workstation cookbook.

Places to learn more:
<add yours here> or in the comments.

Saturday, October 3, 2015

Random observations of a new publicly facing Chef website.

First time using – musings and observations

I hope I won't hurt anyone's feelings with the below; it is what I see as an engineer. Every time I see similar pages, I make a conscious choice to overlook these defects - it could be because I trust the site, or because I found the thing I need.

There is no way in hell I would know how to build the existing page, or actually implement the changes I noted. But what I find most fascinating about my job is that there is a guy somewhere in the company - every company - who knows exactly what comma to change to address the issue. If I were a business, I would seek these guys out and reward them with titles, work-from-home schedules, "work on your own problem", etc... It's just so un-economic and un-business-like to lose them.

To business:

The experience has been an exercise in patience, but only due to an unfortunate coincidence of API incompatibility:

                The GeekWire event was announced using the Seattle address, which excluded the ZipCode:
                "Oct. 1-2, 2015, Sheraton Seattle, 1400 Sixth Ave."
                ( URL: )

Executive Summary:
This experience instantly demonstrated the inferiority of this form of entry, as compared to the auto context/syntax entry offered by modern companies. If this is an internally developed tool for anything other than a personal project, it should be replaced with a real tool meant for the job.

Error 1:
The speakr input fields request ZipCode as a mandatory field.

Result 1:
I had to visit Google Maps and enter the partial address to get the ZipCode to unblock myself.
Pretty sure my mom would not get past this.

Error 2:
As @echohack says - defaults matter. There is a non-primary field that requests the event start time. The defaults of all 4 fields are set to 23:00. Meaning the entries are a valid data type, but the values for the start date are totally off.

I think 8am is a nice default for "start time" on "start date".

Possible scenario: a study of booking data found that most people fly in a day before, and they actually do want the start time to be 11pm on the previous day for a networking dinner.
After digesting things over, the above doesn't make sense, because this isn't an expense system. An event system should specify the actual start time.

Result 2:
Had to make a couple of extra clicks to change the start time.

Error 3:
On initial event creation, the webpage threw errors: "Invalid start date", "Invalid end date". Clicking on the start/end date fields again and resubmitting the form resulted in a successful creation message.

Result / Assumptions
The drop off rate here is probably very high. I actually almost gave up here.
I wonder if there is monitoring or metrics in place to see this kind of drop off. Unlikely, but I do wonder if there is an easy to implement “business flow” monitoring solution for that like Zabbix.

Personal research todo: I wonder if the paid version of Google Analytics is significantly faster at page load times than the free one.

Error 4:
Allowed creation of events which have already occurred.

Possible scenario:
Could be a feature too I guess.

Might be a good idea to check if there is an anti-spam mechanism on the event creation button.
Wonder if vanilla code coverage would pick something like this up, or if you need something like Fortify.

Error 5:
After successful event creation, that event would not show up in search results on
Possible causes: the refresh job on events is not triggered instantly; the page is not yet hooked up to events; past events are ignored as a result of a conscious choice (possibly even from the business); or something else entirely.


Monday, September 21, 2015

Continuous key rotation with Chef

Let's see if I can get this down on paper in a meaningful way..

Players:
a) some server (has to be a Chef Server) - aka: Key Master.
b) the rest of the infrastructure

Tools needed:
a) chef-vault
b) admin key for the Key Master
c) sublime text

The flow:
Key Master converges a recipe that does a global search for all of the nodes. For each node it generates a new key pair. It rotates the key and places the new key into a vault with search criteria of only itself and the node. Each node on converge accesses the vault and retrieves its new key. Marks the vault as converged or deletes the vault after consumption.

Faults:
What happens if the node doesn't converge for a long time? How does key rotation actually work? Can a node even converge if the key has been rotated?

>> probably this is the way <<
Perhaps the node has to generate a key and set the search criteria to itself and the Key Master. Key Master consumes the key and runs the ctl command. Do nodes continue to fail converges until the Key Master updates the key? How does key rotation actually work?

Result:
Every converge the node rotates its own key. The same model can probably be used for SSH keys.

Final thought(s):
What does it actually buy? I don't know, but many customers ask about it. Should it be done? Should each node have a unique, individual vault? Most likely, if you really think about it, there isn't a reason. Nodes should be grouped and each group should run off the same vault. Having 1 vault per node with identical info is meaningless. Especially if there is an admin who has access to all of the nodes anyway.

Friday, August 14, 2015

Pulling Pega space requirements out of prpcUtils.xml

The idea is to decouple the Pega-managed file from your own automation. If your Pega team decides to make changes, you don't want to own those changes, but you do want the deployments to continue running regardless of who made the change. In this particular case, it's the space requirements for deployment.

Step one is to parse the XML and pull out the space requirement (via Ruby, 'cause we're running deployments via Chef).
Step two is to use that value in some meaningful way.
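A sketch of step one (assumptions: REXML, a made-up file path, and a made-up property name - point it at whatever element your prpcUtils.xml actually keeps the requirement in):

require 'rexml/document'

# parse the Pega-managed file without touching it
doc = REXML::Document.new(File.read('/opt/pega/scripts/prpcUtils.xml'))

# 'required.disk.space' is a hypothetical property name for illustration
prop = REXML::XPath.first(doc, "//property[@name='required.disk.space']")
raise 'space requirement not found in prpcUtils.xml' if prop.nil?
space_mb = prop.attributes['value'].to_i

# step two: use the value in a meaningful way, e.g. fail fast before deploying
free_mb = 20_000 # stand-in: query the actual free space here
raise "deploy needs #{space_mb}MB free" if free_mb < space_mb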

Saturday, July 11, 2015

Weblogic + Chef + Automation in General - thoughts and reflections

This is a brain dump written on an airplane in a rather sleep-deprived state. Since alcohol is not free on domestic flights, I opted for coffee and pounded away on the keyboard for a few minutes before legroom and awkward sleeping positions became an issue.

To start, I will say that I am not an expert on WebLogic, which means I am not burdened by years of learning and perfecting the art of managing it. I came to it on Monday, and today is my first Friday after having been exposed to this technology.

So far I hear that my approach to managing WebLogic would work perfectly with Jboss, JVM, or Tomcat… but would absolutely not work here.

What I do know so far is that WL has a central API that has the ability to manage the entire cluster of boxes. It also has the ability to act as a load balancer, as well as the source of information and a central registry.

That last point is very powerful, and from what I have seen so far the most underutilized aspect of WL. Everyone is interested in the centralized functionality completely ignoring the ability to decentralize it.

Let's break it down.

The common approach that I have seen "sold" so far, is to run all commands from the Admin (central) server. The central server will take care of all distribution of packages, starting and stopping of the cluster and all other deployment related functionality. Great. But just how useful is that in actuality?

WL allows you to have a Domain which is distributed across multiple physical machines. A Domain can have multiple clusters distributed across multiple physical machines. Each cluster can have multiple applications installed. Which means a physical machine can have a whole bunch of MS (managed servers, i.e. JVMs) running on it, each belonging to a different domain and a different app.

That's a lot of moving pieces. So when we try to automate a system like that, we will never talk about ONLY WL; we will also talk about patches, modifying property files, modifying port numbers on the host, auto-scaling up and down, and a host of other admin functions which have nothing to do with WL, but which have to account for the fact that WL is distributed across machines.

After a week of dissecting WL, I presently believe that the admin server is a fantastic service discovery tool for WL management.

With regard to managing WL with Chef:
Chef-client runs on each physical server.
Physical servers are grouped by environments - prod / dev / test
Each server's run_list includes the applications which are running on that server or in that environment.
The recipe for that application pulls information for the current environment from some construct, like an attribute, data bag, or the environment where the node resides.
That info includes:
the admin server for a particular app that recipe is responsible for.
Application version number

The recipe has a set of actions (LWRP) - deploy, undeploy, start, stop, etc..
Each chef-client run executes independently on each of the physical servers
The Runlist is either a list of applications on that server, or an LWRP with a list of admin servers and application data bags
Each LWRP or recipe hits the API and finds out what MS are running locally on the host where the recipe is being executed.
Each recipe executes an LWRP which hits the API and performs the needed commands (stop, query, etc..)
The chef-client at the machine level pulls down the EAR file locally and tells the API the location of the EAR. This ensures that the EAR is physically located on the host and is accessible to the MS
It then starts the MS via the API
It then does whatever local changes need to be performed on the MS - server level config - thereby ensuring that all of the changes are in fact done, idempotent, and consistent.
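A hypothetical sketch of what using such an LWRP could look like (every name here - the resource, its properties, the attributes - is invented for illustration):

weblogic_app 'inventory_service' do
  admin_server node['inventory']['admin_server'] # discovered per environment
  version      node['inventory']['version']
  ear_path     '/opt/deploys/inventory_service.ear' # staged locally first
  action       [:stop, :deploy, :start]
end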

If the API / MS allows multi-version support, this style of deployment would allow zero-downtime deploys.
If the WebLogic API is not built to support a lot of API calls, perhaps there is a way to optimize the MS or load balance multiple Admin servers - however, if, let's say, 40 physical servers making 20 API calls over the course of a 15-minute deployment is too much for the Admin server to handle, perhaps it's a good time to look at automating WebLogic out of the company.

Wednesday, February 25, 2015

Single chef-client run with multiple reboots on Windows

To teach is to learn...

...or something along these lines. "How do I manage reboots with chef-client on Windows" is a question I hear every so often. 

So, this time around, I decided to buckle down and write down as many ways as I could remember to reboot a server and continue a chef-client run. No mucking around with the run_list, or messing around with multiple run_lists, definitely no manual steps, and most definitely no knife exec.

Here is my brain child - input and feedback are most welcome!

In my experience I found a couple of common situations where Windows needs to be defibrillated -
  • something has been installed and reboot is needed
  • a bunch of somethings have been installed and reboot is needed
  • something needs to be installed and a reboot is pending
  • a series of somethings needs to be installed and they have various reboot state requirements
  • a week has passed since a reboot has been performed
  • server joined a domain

With Chef managing your infrastructure there is a new reboot scenario:
  • reboot immediately without aborting a chef-client run

The patterns in the Github repo allow users to manage reboots at the resource level, or as a wrapper cookbook pattern.

A real example can be seen in pattern two - which was really the genesis for this repo from way back when -
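For flavor, a minimal sketch of the resource-level approach using the built-in reboot resource (assumes Chef 12+; the package name and path are made up):

reboot 'after_install' do
  action :nothing
  reason 'finish installing something'
  delay_mins 1
end

windows_package 'something' do
  source 'c:/installers/something.msi' # hypothetical installer
  notifies :reboot_now, 'reboot[after_install]', :immediately
end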

Patterns with cats:

Tuesday, November 11, 2014

The 5-minute guide to getting Vagrant up and running (Windows .box edition)

This blog post came about because I had an emergency need for a fresh Windows box on the MacBook I'm using. It took me a few seconds to get up and running, so why 5 min? Because I already had Vagrant installed and a .box file available.

It will really take you about 35 seconds to get up and running, but you still need to download and install some stuff.

Minute 1 - Download and install Vagrant

Minute 2 - Download and Install Virtual Box

Minute 3-4 - Download a Windows .box file

We're going to use Windows 7 Enterprise x86 - others are available here:

  • Open Command line
  • Execute: vagrant box add alex_rocks

Minute 5 - Get vagrant box up and running

This is the only step I had to do to get up and running
  • Open command line and create a new folder (example - C:\vagrant_test)
  • Go to that folder and type vagrant init
  • Edit the Vagrantfile and paste the gist into it (see the sketch after this list):
  • Save, Exit
  • execute: vagrant up
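A sketch of a Vagrantfile along those lines (assumptions: the box name from the add step, and the vagrant/vagrant WinRM credentials most Windows .box files ship with):

Vagrant.configure('2') do |config|
  config.vm.box          = 'alex_rocks'
  config.vm.guest        = :windows
  config.vm.communicator = 'winrm'
  config.winrm.username  = 'vagrant' # assumption: box default creds
  config.winrm.password  = 'vagrant'
  # forward RDP so you can log into the desktop
  config.vm.network :forwarded_port, guest: 3389, host: 33389
end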

Minute 6 - Do whatever you need to do with the Windows you now have. 

...what's a blog post with no pictures?

Monday, November 10, 2014

Maintaining access to PC while away from home (plus nuclear backup option)

This is fairly straightforward and takes roughly 30 to 45 minutes to configure. If you haven't done it before, I'd say it'll take about 2 hours.

You are going to Abu Dhabi, but you still want to access your home (windows) PC.


a) Create backup user (just in case)
b) Enable remote desktop (RDP).
c) Install and configure DNS client to have a stable 'dial-home' address.
d) Configure WOL (wake on LAN) on PC.
e) Configure router to allow remote connections and WOL
f) Configure Scheduled task
g) Configure Power Management
h) Test (also make sure you have whatever notes you need).
z) Plan Z

A) Create backup user

Right click on My Computer, select manage, select users.
There will be a link somewhere to create a user, or just click on a blank space below existing users.
Select New user.
Enter username - something not super obvious like "backup". Instead, make it 'remote' or 'alex2'
Give it a password, check the box for 'password never expires', check the box for 'user can't change password'. Go to groups, and add the 'administrators' group to the list. Click OK.
If you are prompted to enable Firewall rules to allow remote user, click 'yes'
If your antivirus says something, read it, and select whatever options would allow connections.

B) Enable remote desktop (RDP)

Right click on My Computer, select properties, select remote tab, check the box to enable remote access.
Click users button, and add the 'remote' user you created in Step A.

C) Install and configure DNS client to have a stable 'dial-home' address.

Since your ISP can change your IP address (and will change it almost daily) you need a way to find your computer. This is why we install DNS client. It will update your IP every time your computer starts, and map it to a public DNS address which you will use from Zimbabwe or wherever.

My all-time favorite app 'DynDNS' is now a paid app, so it instantly ceased to be my favorite app.
I've tried DuckDNS and it seems to work really well. It also allows Google sign-in, which is nice /borat.

Install - go to the DuckDNS site, sign in, and install the client.
Pick a name and enter it in a white box next to "domains". This will be the consistent 'stable' DNS name that you will use to connect to your home PC.
Now, go ahead and start the DuckDNS if it's not already started. It will show up in your task bar (bottom right).
Right click on it, and pick settings.
Enter the DNS name and token 
- DNS you chose (may be the whole name, I am writing this from memory) - 
- Token is available on the top of the webpage after you login.
Make sure you can click OK and things are green, and update works.

Testing via ping won't work - the DuckDNS guys protect you from discovery, which is also very nice /borat.

D) Configure WOL (wake on LAN) on PC and Wake on Timer.

Probably your PC is not configured for Wake on LAN. Most are not.
You will also enable a 'backup' plan to make sure your PC is awake during certain hours.
You will have to reboot the PC, go into BIOS, enable it and save the change.

WOL settings
Reboot PC.
Press DEL key or F2 or F10 like 7000 times. Start mashing the button as soon as you hear a *beep*.
Once inside look either in Power Settings or Network, or Advanced for "WOL" or "Wake-on-Lan" or "Magic packet".
Enable it.
If you have an option for password, enable it as well, and set some easy password, like 'sparky'.
Write the password down.

Setting Wake on Timer
While looking for WOL you may have seen 'Alarm' or 'Timer' or 'Wake on Event' or 'Schedule' or 'Wake on Schedule' or any other permutation of words indicating that something will happen at a particular point in time or on an event.

Once you find it - Probably under Power Management (might be under advanced).
Set the Wake up timer to around 8am of the timezone where you'll be at.

E) Configure router to allow remote connections and WOL 

Once Windows boots back up. Go to your router config page.
If you hadn't changed any of the default settings on your router, you most likely have a sticker with all of the info on the back of the router. If you have changed defaults, you probably know what you're doing.

Create 2 rules - one for RDP, another for WOL

Go to config page - or something like that
Go to port forwarding page, or Advanced section and then port forwarding.
You should see a button to create a new rule.
Create a rule -
- Name it something descriptive like 'home pc RDP'.
- under trigger or incoming port (usually this is either on the left side or on top of the new rule box) enter some random 5 digit number below 65000. (example 49381 - write it down)
-- we pick a random incoming port instead of mapping 3389 to 3389 to create a tiny bit of security against random port scans. Call it security through obscurity. Every little bit helps.
- for destination port enter 3389
- for destination pc you may either have a drop-down or a list of computers. Best way is to pick a MAC address if you have it. Otherwise Pick a name. If neither of these is available, pick an IP.
-- IP is generally a bad idea, because your router can reassign IP address to your PC if you lose power or if it decides it's a good idea. If you are feeling comfortable in this arena, dig through the settings and see if you can assign a static IP to your computer somewhere to keep this from happening.
Click Save.

Your router *may* have a specific setting for WOL, but I've rarely seen this. If it does, enable it, and you're golden.
Using the steps above, you will create 2 additional rules for WOL  - one for port 7, another for port 9.
I think there is a much lesser need to remap ports here, but you still can.

Click add rule
Name - WOL_7
Source port - 7 (or whatever 5 digit number below 65000 - make sure to write it down)
Destination port 7
Destination PC - Your PC by MAC / Name / IP

Click add rule
Name - WOL_9
Source port - 9 (or whatever 5 digit number below 65000 - make sure to write it down)
Destination port 9
Destination PC - Your PC by MAC / Name / IP

F) Configure Scheduled task

In 'Wake on Timer' section we set our PC to wake up at 8am every day.
Now we don't want the PC to run non-stop all day, but we also don't want it to go to sleep, because I've seen WOL fail if PC is sleeping. So instead, we are going to shut it down after 2 hours.

Click Start, type task, count to 3, and you will see 'Scheduled Task' appear.
Select it
Select tasks from the left side
Select 'new task'
Name it something descriptive like - Shut down PC after 2 hours of being awake.
Under Triggers tab, click 'add' or 'new' specify the time 2 hours after you've set it to automatically wake up.
Under Actions tab, click add.
- command will be 'cmd' (no quotes)
- switches will be '/c shutdown /s /t 30 /f' (no quotes)
-- cmd will open a command prompt. /c will close it after the command executes. /s is to shut down, /t 30 is to wait 30 seconds - if you're logged in, you'll get a warning and a chance to abort it. /f is to force - in case some app is stuck.
Under Advanced
- specify options to run the task if scheduled time was missed.
- specify to kill task if ran for more than 2 hours
- specify other options you may find relevant
- click check box to run using higher privileges.
When you press OK enter your password.

G) Configure Power Management

Press Start
Type power, count to 3, you will see "power management" appear towards the top of the list.
At the main menu, specify turning off monitors after 1 minute - you won't need them since you're connecting remotely.
Under sleep specify never - you'll be turning PC off after 2 hours via task above
Under hibernate specify never - same reason as above
Click Advanced
Select 'performance plan'
Click OK till you exit

H) Test it.

Remote login and DNS
You can test it from the same PC, or if you have another one, test it from there.
**If you're testing from the computer you're connecting to, you won't be able to actually log in, but if you get the login box, you're in a good place.

Open command prompt - Win+R, type CMD, press enter
type mstsc /v:<YourDNSname>:<RDPportNumber>
-- mstsc is the program for remote desktop
-- /v is the switch for the address
-- <YourDNSname> is the DNS name you chose yourself
-- <RDPportNumber> is the port in your router that will forward to 3389
-- example command:    mstsc /v:<YourDNSname>:49381

you should get a login box.
-- If you get login box 99% of the stuff you configured is working.
You will see username and password fields, and under domain you will have the name of the box you're presently on, not the name of the box you're connecting to.
Enter username as <computer name you're connecting to>\username
-- example: ALEX-PC\Alex2
Enter password
Press OK

Testing WOL

Turn computer OFF.
On another computer or cellphone, go to 
-- I've used it for years, but I don't actually know anything about them, or if it still works today
Enter necessary info
-- At this point you will realize you don't have a MAC address, because I never told you to write it down.
-- Obtain and write down your MAC address

IP or Hostname -
-- example -
MAC will be your MAC 
-- example - 01-23-45-67-89-ab
Password / schedule / zone are optional
Press "Wake up my pc!" button

Count to 3 or maybe 42
Your PC should turn on.

Testing Wake on Schedule

You can either wait until the scheduled time and see if PC turns on
Or you can go back into the BIOS and change the wake up timer to around 2 minutes in the future.
Turn off PC
Count to 120
PC should turn on
Go into BIOS and Change the timer back to Zimbabwe 8am

Testing Scheduled task

Go to Windows
Right click on your task
Click Run
Count to 30
PC should turn off

Obligatory funny -

Plan Z

This is my new favorite recommendation because it works, but it's definitely the heavy handed approach.
- Make an account at (hosted chef - main url here
- Download and Install chef-client. Make sure to check the service check-box. 
-- Get files from (main url - )
- Connect (chef term is bootstrap) your home PC to your hosted Chef account.
-- This is probably the most 'complicated' step in the whole process. The best and simplest way is to head over to the docs instead of me retyping the same stuff here.

Plan Z in action:
Once your PC is connected to the Chef server, you will be able to add items to be executed on your PC from anywhere on the planet, without ever actually having to connect to your PC. Even if your router config changes or your PC gets stolen - as long as it's working and has access to the internet, you can execute commands on it.

Monday, September 22, 2014

Working with Chef and Vagrant while on VPN

I've been beating my head against every object in my living room for the last 2 hours.

Last time this happened, it turned out that the VPN I was on was blocking all 10.* addresses - in case you didn't realize, Vagrant assigns a 10.* address by default. So simply changing the network IP address in the Vagrantfile solved the problem.

Solution to internal network blocking 10.* addresses:

  config.vm.network :public_network, ip: ""

Well, something happened and everything stopped working again (I am a consultant for a company, so perhaps something got locked down just for contingent staff, because all of the FTEs can work just fine with default config).

Solution is to allow VPN passthrough in the Vagrantfile:

  config.vm.network :private_network, ip: ""
  config.vm.provider :virtualbox do |p|
    p.gui = true
    p.customize ["modifyvm", :id, "--memory", "1500", "--clipboard", "bidirectional", "--natdnshostresolver1", "on"]
  end

* shamelessly stolen from a random (but very useful) site I found after banging my head against a keyboard failed to produce the desired outcome. (source: (I even kept the pretty code background)).

Random yet very accurate and completely unrelated funny

Tuesday, August 12, 2014

STIG, Windows 2012, ACL, ACE, Registry, and an aneurysm

It's a very rewarding feeling, when after hours of research, google, and beating your head against the table, you finally come up with an answer. But I can't help feeling that 3 lines of code for 10 hours of research somehow diminish the sense of accomplishment. Not to say that I want 300 lines, but you know...

STIG and CIS benchmark documentation are as useful as they are impractical in the modern age. It's pages upon pages of useless manual steps. It's 2014.  TWENTY FOURTEEN!!  Unless you manage 4 systems that you NEVER rebuild, no one in their right mind is going to do this nonsense manually.

Example of nonsense:
Configure the policy value for Computer Configuration -> Windows Settings -> Security Settings -> Advanced Audit Policy Configuration -> System Audit Policies -> Global Object Access Auditing -> "File system" with the following: Select All Items.

The challenge at hand was automating implementation and validation of STIG V-1080:

Use the AuditPol tool to review the current configuration. Open a Command Prompt with elevated privileges ("Run as Administrator").  Enter "Auditpol /resourceSACL /type:File /view". ("File" in the /type parameter is case sensitive). The following results should be displayed:

Entry: 1
Resource Type: 

The command was successfully executed.

Apparently it's not at all easy to apply auditing rules to the registry via CMD or PowerShell.
Microsoft's documentation leaves a lot to your imagination.

And as always, the answer was a mixture of Stackoverflow and blogs:

Step 1. Set the Audit rules manually.
Step 2. Get the SDDL from registry
$acl = get-acl hklm:\software
Step 3. Apply SDDL via automation

$acl = Get-Acl HKLM:\SOFTWARE
Set-Acl -Path HKLM:\SOFTWARE -AclObject $acl

Step 4. Write Chef recipe and Serverspec integration test
Step 5. Realize that for some reason (either too much coffee or not enough) I confused two STIG rules and went a mile in the direction of applying auditing to HKLM:\Software instead of C:\ (easy mistake to make, I suppose)
Step 6. Change the script to actually apply auditing to all Drives instead.

Step 7. Make Chef recipe
Step 8. Make integration test via Serverspec
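A sketch of what step 7 could look like (the node attribute holding the captured SDDL is hypothetical - capture your own string in step 2):

sddl = node['stig']['software_key_sddl'] # hypothetical attribute

powershell_script 'apply audit SDDL to HKLM:SOFTWARE' do
  guard_interpreter :powershell_script
  code <<-EOH
    $acl = Get-Acl HKLM:\\SOFTWARE
    $acl.SetSecurityDescriptorSddlForm('#{sddl}')
    Set-Acl -Path HKLM:\\SOFTWARE -AclObject $acl
  EOH
  not_if "(Get-Acl HKLM:\\SOFTWARE).Sddl -eq '#{sddl}'"
end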

Useful links (in order of usefulness):


Tuesday, July 15, 2014

RHEL from a Windows guy

As ironic as it sounds, it wasn't until just a few months ago that I finally learned what RHEL stood for. That's 1994 - 2014. Nearly 20 years of blissful ignorance..

Now, I have a task to test a set of Chef recipes on RHEL 6.5 ... in Japanese.

So... questions are - 
1) How do I install RHEL?
2) How do I get it into Japanese?

... this is going to be as much of a brain dump as a manual for myself, to retrace my steps so that I can write a Chef cookbook to automate everything when the time comes..

For the record - My workstation is a mac running OSX 10.9.4

1 - Getting RHEL box up.

Apparently you need some special license which you have to pay for. And I know nothing about it... nor does it have anything to do with what I'm trying to do. So to get some lift, I went with AWS.

Login, EC2 -> New server -> RHEL 6.5 -> Micro. Alternatively, you could spin up a CentOS box, which will be *almost* the exact same thing, but not really. So to avoid duplication I wanted actual RHEL. Also, you can use knife-ec2.

Even that has a tiny yak that one must shave.
  a) create an AWS key (aka IAM)
  b) save it locally
  c) fail to SSH using the key
  d) Google to find out that you need to change permissions on the key
  e) chmod the pem key (e.g. chmod 600)
  f) ssh success

Ok. I'm in, now to validate if my RHEL is the actual version I need it to be.
   a) attempt a few obvious commands.
   b) fail
   c) google (
   d) lsb_release -i -r
   e) above fails on some boxes, one alternative is to cat /etc/redhat-release

2 - How do I get it into Japanese

a) google...official documentation
b) read -

Per the kickstart docs, installation doesn't support Japanese / Chinese / a few other languages, and defaults to English, which is great. One less test case to worry about, and theoretically I won't have to deal with Kickstart since all configs can be done post-OS-install.

On a related note, it sounds like there are 4 language settings:
* keyboard
* UI
* shell
* runtime  
btw - example kickstart files can be found here -

c) fail.. system-config-language does not exist
..) google...
..) sudo yum install system-config-language
So.. I have no idea what the root password is, and I don't feel like resetting it or using the EC2 UI again..
..) sudo su - 
..) system-config-language
..) scroll down..... tab tab.. ok!
..) Everything's still in English. now what? 

Turns out that by default, many Linux distros will take on the locale of the system initiating the SSH connection. So there are a couple of other configuration steps that need to be done:

So in Summary:

  • lsb_release -i -r   or    cat /etc/redhat-release
  • sudo su -
  • yum install system-config-keyboard
  • yum install system-config-language
  • system-config-keyboard (should obviously be done first, otherwise the language selection is in Japanese.)
  • system-config-language
  • modify "keytable" value in  keyboard setting file /etc/sysconfig/keyboard (set it to jp106)

Available keyboard maps are here -- /lib/kbd/keymaps/i386/qwerty/

Good discussion on the topic:
Finally found something very useful...confirming my suspicions that changes were only temporary unless some file was actually modified to set them in stone:
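When the cookbook time comes, persisting the language is a one-resource job. A sketch (assumes RHEL 6 reads the locale from /etc/sysconfig/i18n; verify the LANG value on a manually configured box):

file '/etc/sysconfig/i18n' do
  content %Q(LANG="ja_JP.UTF-8"\n)
  owner 'root'
  group 'root'
  mode '0644'
end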

-------------random unarranged stuff below this line -----------

3 - other stuff

bootstrapping an AWS box is a little bit trickier now than it was a few months ago
gotta use the pem key, so you can literally grab the -i switch from the SSH command and dump it into knife bootstrap

knife bootstrap 54.208.333.333 -x ec2-user -i ../../Projects/Amazon/alex.pem --sudo