Release Engineer - organizing chaos in the world of IT: 2016

Oct 5, 2016

Modifying chef resources after they're already in a resource collection

BOOM!

First, let's fire up chef-shell to demonstrate by creating a basic resource

$ chef-shell

chef (12.14.57)> recipe_mode

chef:recipe (12.14.57)> file 'testing_edit' do

chef:recipe > content 'words'

chef:recipe ?> end

=> <file[testing_edit] @name: "testing_edit" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :create, :delete, :touch, :create_if_missing] @action: [:create] @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: nil @default_guard_interpreter: :default @elapsed_time: 0 @sensitive: false @declared_type: :file @cookbook_name: nil @recipe_name: nil @content: "words">

Easy way to modify resource collection

Now, I am going to modify this resource using a NEW recipe DSL helper, edit_resource

chef:recipe (12.14.57)>

chef:recipe >

chef:recipe > edit_resource(:file, 'testing_edit') do

chef:recipe > content 'different words'

chef:recipe ?> end

=> <file[testing_edit] @name: "testing_edit" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :create, :delete, :touch, :create_if_missing] @action: [:create] @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: nil @default_guard_interpreter: :default @elapsed_time: 0 @sensitive: false @declared_type: :file @cookbook_name: nil @recipe_name: nil @content: "different words">

chef:recipe (12.14.57)>

Coolness (effectively remove a resource from the run by setting its action to :nothing; it stays in the collection but does nothing):

chef:recipe (12.14.57)> edit_resource(:file,'testing') do
chef:recipe > action :nothing
chef:recipe ?> end

=> <file[testing] @name: "testing" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :create, :delete, :touch, :create_if_missing] @action: [:nothing] @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: nil @default_guard_interpreter: :default @elapsed_time: 0 @sensitive: false @declared_type: :file @cookbook_name: nil @recipe_name: nil @content: "words">
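
To make the semantics a bit more concrete, here is a toy model in plain Ruby. It is only a sketch and not how Chef implements it: ToyResource and ToyCollection are made-up names, and the behavior it assumes is what the transcript above shows (edit_resource looks the resource up by type and name and changes it in place) plus, as far as I know, declaring the resource when it is missing.

```ruby
# Toy model only: ToyResource and ToyCollection are hypothetical names,
# not Chef classes. Real Chef resources carry far more state.
ToyResource = Struct.new(:type, :name, :content, :action)

class ToyCollection
  def initialize
    @resources = []
  end

  # Return the resource with this type and name, or nil if never declared.
  def find(type, name)
    @resources.find { |r| r.type == type && r.name == name }
  end

  # Append a new resource with the default :create action.
  def declare(type, name)
    resource = ToyResource.new(type, name, nil, :create)
    yield resource if block_given?
    @resources << resource
    resource
  end

  # Mimic edit_resource: change the existing resource in place,
  # or declare it when nothing with that type and name exists yet.
  def edit(type, name, &block)
    resource = find(type, name)
    return declare(type, name, &block) unless resource

    block.call(resource) if block
    resource
  end

  def size
    @resources.size
  end
end

collection = ToyCollection.new
collection.declare(:file, 'testing_edit') { |r| r.content = 'words' }
collection.edit(:file, 'testing_edit') { |r| r.content = 'different words' }
collection.edit(:file, 'testing_edit') { |r| r.action = :nothing }

edited = collection.find(:file, 'testing_edit')
puts "#{edited.content} / #{edited.action} / #{collection.size} resource(s)"
```

The point is that no second resource shows up: the collection still holds one file, just with new content and a new action. If you want the resource gone rather than idle, the same family of helpers also has delete_resource.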

Oct 4, 2016

Setting up chef Automate / Workflow (aka: delivery) in completely air gapped environment - level 1

Manual delivery install in airgapped env (AWS in Oregon)

Creation of Air-gapped environment

Create DHCP option set

Create vpc 'alexv-manual automate in airgapped env'
set DNS resolution to Yes
set DNS Hostname to Yes
set DHCP option set to one above

Create a Windows 'jump box' inside VPC
network: vpc above
subnet - create new
VPC - vpc above
AZ - no pref
CIDR - same as vpc
refresh vpc field and select subnet
assign public IP - true
Network - default
Storage - default
(make sure you have enough free space to store all of the binaries needed inside VPC)
(i used 40 gigs) - this may mean that you have to expand default hard drive to occupy full HD space
Tag:
Name - alexv-jump box
SG:
create new SG 'jump box'
RDP - anywhere
HTTP - anywhere
HTTPS - anywhere
Select your keypair
** On your local Mac inside RDP app, enable folder redirection when you add this box.
set folder redirect to the location where your delivery.license file lives
on the windows box, install filezilla - to make it easy to transfer files

Installing Delivery

Create Chef Server
m3.medium
VPC - same as above
auto assign public ip - false
storage - change to 30
tag - alexv-chef-server
SG:
create new SG "chef server"
open port 22
open All ICMP
10000-10003
8989
HTTP
HTTPS
Keypair - select yours

Create Workflow server
click on chef-server, select more like this
VPC - same as above
Subnet - internal subnet
auto assign public ip - false
storage - change to 30
tag: alexv-Workflow-server
SG:
create new SG "Workflow server"
open port 22
open All ICMP
10000-10003
8989
HTTPS
HTTP
(maybe needed?) 9200 - due to Elasticsearch GET errors
(maybe needed?) 5672 - due to another Elasticsearch failure? (5672 is the default RabbitMQ port)
Keypair - select yours

Create Windows (or *nix) build node
network: vpc above
subnet
VPC - vpc above
AZ - no pref
CIDR - same as vpc
refresh vpc field and select subnet
assign public IP - false
Network - default
Storage - default
Tag:
Name - alexv-windows build node
SG:
create new "windows build node"
open RDP - anywhere
open All ICMP
open 5984-5986 anywhere (for WinRM, which listens on 5985 and 5986; RDP itself is 3389 and is covered by the RDP rule above)
Select your keypair

Internet Gateway:
create internet gateway - alexv-air gapped
attach to VPC (above)

Route:
when you create VPC, it created a route table
edit:
add 0.0.0.0/0 -> point at internet gateway
Save

Create 4 CentOS boxes to be environment nodes
medium size
HDD default
SG - copy from workflow server
Name SG "environment nodes"
create

Create 2 CentOS boxes to be build nodes
medium size
HDD - 15 gigs
SG - copy from workflow server
Name SG "build nodes"
create

Actually Install and Configure Automate

on the Chef Server and Automate node - follow directions
===================
disable ipv6 in /etc/hosts
make sure they can ping each other
make sure they can resolve dns of each other
make sure they can't access internet
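
The DNS, port and "no internet" checks above can be scripted. Below is a rough sketch in plain Ruby (stdlib only) that you could run on each box. Assumptions on my part: the port list is copied from the security groups earlier in this post, 8.8.8.8:53 is just an arbitrary "something on the internet" target, and the method names are made up. It does not cover ping (ICMP); use the regular ping command for that.

```ruby
require 'resolv'
require 'socket'

# Ports taken from the security groups above; adjust to your own rules.
REQUIRED_PORTS = [22, 80, 443, 8989, 10_000, 10_001, 10_002, 10_003].freeze

# True when a TCP connection to host:port succeeds within the timeout.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# True when the name resolves to at least one address.
def resolves?(hostname)
  !Resolv.getaddresses(hostname).empty?
rescue Resolv::ResolvError
  false
end

# Build a small report for one peer (the other server's internal hostname).
def peer_report(peer)
  {
    dns: resolves?(peer),
    ports: REQUIRED_PORTS.to_h { |port| [port, port_open?(peer, port)] }
  }
end

# Air-gap check: reaching a public address should FAIL from inside the VPC.
# 8.8.8.8:53 is only an example target, not something this setup depends on.
def air_gapped?
  !port_open?('8.8.8.8', 53, timeout: 3)
end
```

On the Chef Server you would call something like `p peer_report('ip-10-0-0-67.ec2.internal')` and `p air_gapped?`, then repeat from the Automate box with the Chef Server hostname. Ports only show as open once the service behind them is installed and running, so before the install this mostly tells you about DNS and the air gap.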

Jump Box (or workstation)
===================
copy target os binaries into jump box: chef server, automate, push jobs server, chefdk, chef manage, supermarket if needed.
copy binaries to correct server /tmp folder
copy chefdk for use on workstation as a management node
setup user ssh auth
ssh-keygen -t rsa -b 4096 -C "you@example.com"

Chef Server
===================
install chef server per directions
chef-server-ctl user-create alex alex alex alex@chef.io 'alexalex' --filename /tmp/alex_user.pem
chef-server-ctl org-create alex_org 'Fourth Coffee, Inc.' --association_user alex --filename /tmp/alex_org-validator.pem

install push jobs per directions:
sudo chef-server-ctl install opscode-push-jobs-server --path /tmp/opscode-push-jobs-server.x86_64.rpm

sudo chef-server-ctl user-create delivery delivery user deliver@chef.io 'alexalex' --filename /tmp/delivery_user_key.pem
sudo chef-server-ctl org-create automate_org 'org description' --filename /tmp/automate_org-validator.pem -a delivery

Install manage: (optional)
sudo chef-server-ctl install chef-manage --path /tmp/chef-manage-2.4.3-1.el6.x86_64.rpm
reconfigure chef, push, manage

on the Delivery server
===================
install delivery
setup command: sudo delivery-ctl setup \
--license /tmp/automate.license \
--fqdn ip-10-0-0-67.ec2.internal \
--key /tmp/chefserver/delivery_user_key.pem \
--server-url https://ip-10-0-0-80.ec2.internal/organizations/automate_org
copy all PEMs from chef server to delivery (validator, admin, delivery_user)
Enter name of your enterprise
example: alex_ent
(note: look for a bug here where enterprise is created, but admin creds are not displayed nor created in /etc/delivery/<enterprise-admin-credentials>)
(if bugged) create enterprise manually
delivery-ctl create-enterprise alex_ent --ssh-pub-key-file=/etc/delivery/builder_key.pub
Copy ChefDK binary to /tmp/chefdk-0.18.30-1.el6.x86_64.rpm
install build node
sudo delivery-ctl install-build-node -I /tmp/chefdk-0.18.30-1.el6.x86_64.rpm -f 10.0.0.23 -u chef -P chef

Verify build node works with `knife node status`
this will query push jobs server for status of each node (different from knife status)
available means push jobs can communicate with the node (you will know that at least push jobs is running at this point)
Verify you can fire off a push job:
knife job start chef-client --search '*:*'

create user (via UI or CLI)

add public ssh key from workstation's `ssh-keygen` step to the user
delivery ui -> user -> ssh pub key

Jump Box (or workstation)
===================
Install chefdk
configure knife.rb with delivery key for communication with chef server
example:
node_name 'delivery'
chef_server_url "https://ip-10-0-0-80.ec2.internal/organizations/automate_org"
client_key 'C:\Users\chef\.chef\delivery.pem'
trusted_certs_dir 'C:\Users\chef\.chef\trusted_certs'
# analytics_server_url 'https://cad-chef-server/organizations/cad'
cookbook_path 'C:\Users\chef\chef-demo\cookbooks'

fetch certs if needed
knife ssl fetch

verify knife works
knife node list
(or from delivery server)
knife node list -k /etc/delivery/delivery.pem -u delivery --server-url https://ip-10-0-0-80.ec2.internal/organizations/automate_org

Pull down all of the cookbook dependencies to be used in air-gapped env (i do it via berks)
mkdir repo
cd repo
chef generate cookbook staging (this will be the first test cookbook)
modify metadata.rb of seeding cookbook to include:
depends 'delivery-truck'
depends 'push-jobs'
depends 'build_cookbook'
depends 'delivery_build'
mkdir seeding
cd staging\.delivery\build_cookbook
run `berks vendor ..\..\..\seeding` to pull down all dependencies into a local folder

upload necessary cookbooks up to chef server
knife cookbook upload -o seeding -a
(or alternatively `knife cookbook upload delivery-truck --include-dependencies -o seeding`)

test ssh auth to delivery box
ssh -l alex@alex_ent -p 8989 ip-10-0-0-67.ec2.internal

Configure delivery cmd - C:\Users\chef\cookbooks\staging\.delivery\cli.toml
in root of staging cookbook$ delivery setup -e alex_ent -o automate_org -s ip-10-0-0-67.ec2.internal -u alex

make sure you can interact with delivery via delivery cli:
Verify API works
delivery api get users
delivery api get orgs
verify you can create a project
create a cookbook
`delivery init` inside that cookbook

First pipeline
===================
i'll use staging cookbook as it's a nice example
initialize delivery pipeline
inside staging cookbook run `delivery init`
bump metadata.rb if needed
modify config.json to exclude spec and test folders due to foodcritic testing them, leading to workflow epic failing on linting phase.
$ cat config.json
{
  "version": "2",
  "build_cookbook": {
    "name": "build_cookbook",
    "path": ".delivery/build_cookbook"
  },
  "delivery-truck": {
    "lint": {
      "foodcritic": {
        "excludes": ["spec", "test"]
      }
    }
  },
  "skip_phases": [],
  "build_nodes": {},
  "dependencies": []
}
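
If you end up doing this for more than one project, the config.json change can be scripted instead of hand-edited. A minimal sketch in Ruby, assuming the JSON layout shown above; exclude_from_foodcritic is a made-up helper name, and the starting config in the example is a trimmed stand-in rather than the exact file `delivery init` generates.

```ruby
require 'json'

# Add directories to delivery-truck's foodcritic excludes, keeping
# every other key in the config untouched.
def exclude_from_foodcritic(config, dirs)
  lint = ((config['delivery-truck'] ||= {})['lint'] ||= {})
  foodcritic = (lint['foodcritic'] ||= {})
  foodcritic['excludes'] = ((foodcritic['excludes'] || []) + dirs).uniq
  config
end

# Stand-in for a config.json that has no delivery-truck section yet.
original = JSON.parse(<<~CONFIG)
  {
    "version": "2",
    "build_cookbook": { "name": "build_cookbook", "path": ".delivery/build_cookbook" },
    "skip_phases": [],
    "build_nodes": {},
    "dependencies": []
  }
CONFIG

updated = exclude_from_foodcritic(original, %w[spec test])
puts JSON.pretty_generate(updated)
```

In real use you would read the project's config.json with File.read and write the result back with File.write. Running it twice is harmless, since the excludes list is de-duplicated.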
change Berksfile (of build cookbook)
Since you're not connected to internetz, you'll fail all phases of workflow due to Berksfile
change source to :chef_server
$ cat Berksfile
source :chef_server
# or your internal supermarket
metadata

group :delivery do
cookbook 'delivery_build'#, chef_api: :config
cookbook 'delivery-base'#, chef_api: :config
cookbook 'test', path: './test/fixtures/cookbooks/test'
end

#original
# group :delivery do
# cookbook 'delivery_build', git: 'https://github.com/chef-cookbooks/delivery_build'
# cookbook 'delivery-base', git: 'https://github.com/chef-cookbooks/delivery-base'
# cookbook 'test', path: './test/fixtures/cookbooks/test'
# end

add and commit changes
git add -u
git commit -m 'very descriptive comment'
delivery review

Bill of Materials:
===================
Filezilla (windows) - management node
Chef-server-core-12.9.1
delivery-core-0.5.346
push-jobs-1.1.6
chefdk-0.18.30-1.el6.x86_64.rpm
note: seems like chefdk 0.17.17 doesn't work in isolated environment with a Yajl error
chefdk-0.18.30 for windows
chef manage rpm
supermarket rpm
berks vendor of `build cookbook`
should include all of the following:
build-essential
build_cookbook
chef-ingredient
chef-sugar
chef_handler
compat_resource
delivery-base
delivery-sugar
delivery-truck
delivery_build
dmg
git
mingw
packagecloud
push-jobs
runit
seven_zip
test
windows
yum
yum-epel
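
Before carrying the vendored folder into the air-gapped VPC, it is worth checking that nothing from the list above is missing, because you cannot simply download it later. A small sketch in Ruby, assuming `berks vendor` wrote one directory per cookbook (named after the cookbook) into the target folder; missing_cookbooks is a made-up helper name.

```ruby
require 'tmpdir'
require 'fileutils'

# Cookbook names copied from the list above.
REQUIRED_COOKBOOKS = %w[
  build-essential build_cookbook chef-ingredient chef-sugar chef_handler
  compat_resource delivery-base delivery-sugar delivery-truck delivery_build
  dmg git mingw packagecloud push-jobs runit seven_zip test windows yum yum-epel
].freeze

# Return the required cookbooks that have no directory under vendor_dir.
def missing_cookbooks(vendor_dir, required = REQUIRED_COOKBOOKS)
  present = Dir.children(vendor_dir).select do |entry|
    File.directory?(File.join(vendor_dir, entry))
  end
  required - present
end

# Demo against a throwaway directory that only holds two cookbooks.
Dir.mktmpdir do |dir|
  %w[git yum].each { |name| FileUtils.mkdir_p(File.join(dir, name)) }
  puts "missing: #{missing_cookbooks(dir).join(', ')}"
end
```

Against the real thing you would call `missing_cookbooks('seeding')` from the repo folder and expect an empty list back.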

troubleshooting.
================

*) The setup command *may* create an enterprise for you. If you see that behavior, and do not get credentials as an output, you will have to delete the enterprise, and create it again using create-enterprise command.

*) node create command installs push jobs via this script:
https://github.com/chef/delivery/blob/114649cc8d6ddbf494a9666ef476e6a4b8523a7f/omnibus/files/ctl-commands/installer/gen_push_config.sh
..which is called by this script:
https://github.com/chef/delivery/blob/2ab9d4809e4ac1f237b52ee20088b1ac68d85af4/omnibus/files/ctl-commands/build-node/installer.rb#L217

Aug 11, 2016

Getting started with DevOps - Basic Chef (and any other CI/CD environment)

I got a question a few weeks back, and I think the answer is worth sharing.

Do we need to have a chef workstation hosted in our cloud environment that everyone logs in to (something like a jump box configured with chefdk and all the plugins), or can users spin up VMs off their own machine and that is used as the Chef Workstation?
If spinning up VMs off our machine, how do we connect to chef-repo which I’m setting up in AWS?
We need to connect to git for source control. I have set up an enterprise git instance - how do I change the cookbooks to connect to our instance of git?

This is going to be a lot of words, because there is no easy answer...
What one starting a similar journey should do though, is take the below suggestions, and run through them iteratively. Version 1, ver 2, 3...etc. Don't try to do everything at once.

Also, https://github.com/chef-customers/dojo-assessment-guide is a fantastic tool to figure out where you are in the DevOps journey, and where others typically go.

Git:

Source control will be your base, so it's first in the list.

For git, there are a lot of articles about "forking the code" and the eventual price of having done that. So, when it comes to community cookbooks, best course of action is - don't fork community cookbooks (or at the very least don't fork it for very long).

If you rely on a community cookbook, take it, along with the full git history, and upload it into your private git and your chef server. Any changes to community cookbooks should be done via `wrapper cookbook` (mycompany_apache for apache, etc..).

If some feature is not supported (or you found a bug), make a change and ASAP push the change back to the community via PR - this is so you don't have to maintain the fork, and can take advantage of improvements in the public version. (example: you're using apache 6 with your private hotfix, apache 7 comes out, and it's drastically different. Due to your custom changes, you need to spend 20 hours merging the versions and applying any custom hotfixes you've accumulated. You further fork the code to make it work. You spend all your days fixing bugs. Your head hurts from drinking too much coffee....)

Chef repo:

So, a good pattern is to have a git repo called chef_base or mycompany_base, etc..
Usually, a user starts by cloning this repo locally. Any updates that would affect the whole company would be pushed back to git, so every user can benefit from them.

In the chef-base you'll have your environments folder with various environments, roles with roles, and a folder called cookbooks (execute `chef generate repo test` for a basic example). You would have chefignore and .gitignore filled out as per your org rules. You would then do something about the .chef folder - either have a symlink that points to a known location, use a known variable to load the file, or leave it up to the user to fill in the details. Typically the .kitchen.yml or Vagrantfile to stand up a cloud workstation would live here (more on this later). The \cookbooks folder is either empty or has a global cookbook like chef_workstation in there.

So, when a user starts working with chef, they clone chef_base and have everything they need to get started. All they do, is go into \cookbooks folder, and git clone the thing they want to work on. This keeps chef-repo and each cookbook they work on completely independent. If they want to add new community cookbooks to your org, they follow the same process as above: clone community cookbook into /cookbooks and push it internally.

Chef workstation:

So, you definitely want each user to have their own cloud workstation. Also, they should have the ability to create/destroy them whenever they need to. On average, workstations don't survive longer than a week (nor should they).

Locally is pretty easy. Use test kitchen, or vagrant, mount the local chef-repo folder inside the VM and you're done.
(Here is how kitchen would work: https://github.com/test-kitchen/kitchen-vagrant#-synced_folders)
(here is how vagrant would work: https://www.vagrantup.com/docs/synced-folders/basic_usage.html)

With AWS/Azure/VMware/etc., mounting a folder is done differently. In this scenario, when users run Test Kitchen, it would create additional VMs in the cloud. You'll need to set up a sync mechanism if your virtualization platform doesn't support mounting a local folder on a VM. You could give users an EBS volume they could share across workstation and local dev machine. Or just a regular network share they can mount locally and on a workstation. Also, I heard https://atom.io/packages/remote-sync works really well, however I never touched it personally.

Key takeaway here is that lots of companies are going down the VM road.

The long version is that this decision will be guided by a couple of factors - how powerful your users' workstations are, your company's business direction, and how much money your DevOps initiative has been given.

What I've seen in the wild is very interesting. The best and the worst shops use nearly identical workstation hardware. On the one end of the bell curve, there are companies where employees have 2 gig laptops incapable of opening notepad in under 30 seconds. All work is done locally, and is very painful. On the other end, you also have 2 gig laptops - though typically surface and mac book minis - however, in this case, all of the work is done in the cloud, and these machines are plenty powerful since they are simply used to RDP into remote resources (and used for skype and facebook the rest of the time).

Hope that helps.
Alex-

Apr 25, 2016

Resetting opscode-reporting password

Once in a while upgrading opscode-reporting goes wrong. Or it doesn't start, you do manual cleanup, and basically passwords go out of sync.

The solution is pretty straightforward - reset the passwords to what the system thinks the passwords should be.

1.
Open up /etc/opscode-reporting/opscode-reporting-secrets.json
Grab opscode_reporting & opscode_reporting_ro passwords and pipe them to opscode-pgsql

echo "ALTER USER opscode_reporting PASSWORD 'XXXXX' " | su -l opscode-pgsql -c 'psql'
echo "ALTER USER opscode_reporting_ro PASSWORD 'XXXXX' " | su -l opscode-pgsql -c 'psql'

You should get the result "ALTER ROLE" from each of the 'echo' commands
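
One thing that can bite here: if a generated password contains a single quote, the echo line above produces broken SQL. Below is a small Ruby sketch that builds the statements with the quote doubled, which is the standard SQL way to escape it. alter_user_sql is a made-up helper, the passwords are placeholders, and I am deliberately not parsing the secrets file, since its exact layout is not shown here.

```ruby
# Build the ALTER USER statement, doubling single quotes so a password
# that contains ' cannot break out of the SQL string literal.
def alter_user_sql(user, password)
  raise ArgumentError, "unexpected role name: #{user}" unless user.match?(/\A[a-z_][a-z0-9_]*\z/)

  "ALTER USER #{user} PASSWORD '#{password.gsub("'", "''")}';"
end

# Placeholder passwords; the real ones come from the secrets file.
passwords = {
  'opscode_reporting' => 'XXXXX',
  'opscode_reporting_ro' => "it's-a-placeholder"
}

passwords.each { |user, password| puts alter_user_sql(user, password) }
```

The printed statements can be piped to psql the same way as the echo commands above.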

2.
Next, make sure rabbitmq password is in sync:
In the same .json file, in the "opscode_reporting" section, grab the "rabbitmq_password" and use it in place of XXXXX

PATH=/opt/opscode/embedded/bin:$PATH rabbitmqctl change_password runs XXXXX

3.
then chef-server-ctl restart opscode-reporting

4.
And finally, you might still be broken.
If you look at the process list and see an error similar to below, send the HUP to svlogd to reload the configs.

root 1456 0.0 0.0 4092 196 ? Ss 2015 3:12 runsvdir -P /opt/opscode/service log: vlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?svlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?svlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?svlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?

So grab the correct pid by running chef-server-ctl status

...
run: opscode-reporting: (pid 17407) 30088s; run: log: (pid 32415) 88051s
...

kill -HUP 32415

Apr 3, 2016

Chef - Passing output of one resource as input to the next

There are a couple of ways to do that.

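One is via lazy, which delays evaluating a property until the resource actually runs (by which point the earlier resource has produced its output)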

directory '/tmp'

file '/tmp/alex.txt' do
  content 'sudo make me a sandwich'
end

ruby_block "something" do
  block do
    command = 'cat /tmp/alex.txt'
    command_out = shell_out(command)
    # node.set works on Chef 12 but was later deprecated and removed;
    # node.run_state['a'] is an alternative for values only needed during this run
    node.set['a'] = command_out.stdout
  end
  action :create
end

file '/tmp/alex2.txt' do
  action :create
  owner 'root'
  group 'root'
  mode '0644'
  content lazy { node['a'] }
end
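
Why lazy is needed at all: Chef first compiles every resource in the recipe and only then converges them, so without lazy the content of alex2.txt is read before the ruby_block has run. Here is a toy illustration in plain Ruby. It is a sketch and not Chef internals: ToyFile is a made-up class, and a lambda stands in for `lazy { ... }`.

```ruby
# Toy model of Chef's two phases; ToyFile and the ordering below are
# illustrative only, not Chef internals.
class ToyFile
  def initialize(content)
    @content = content
  end

  # At converge time, call the value if it was wrapped in a lambda (our
  # stand-in for `lazy { ... }`), otherwise use it as-is.
  def converge
    @content.respond_to?(:call) ? @content.call : @content
  end
end

node = {}

# "Compile" phase: both resources are built before anything has run,
# so node['a'] is still nil here.
eager = ToyFile.new(node['a'])
lazy  = ToyFile.new(-> { node['a'] })

# "Converge" phase: the ruby_block runs first and fills in the value...
node['a'] = 'sudo make me a sandwich'

# ...and only the lazy file sees it.
puts "eager: #{eager.converge.inspect}"
puts "lazy:  #{lazy.converge.inspect}"
```

The eager file captured nil at compile time, while the lazy one picks up whatever the ruby_block stored by the time it converges.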

Mar 31, 2016

Powershell and chef - how to view the script powershell_script resource is executing

So, I was troubleshooting passing arrays to powershell_script resource.

Troubleshooting powershell_script

First - the easy way. Just run chef-client -l debug. In debug logging, you can see the whole script, which might be enough.

What makes troubleshooting powershell_script difficult, is the way it works from inside chef. A temporary file is created, and immediately nuked after execution, making it somewhat difficult to see exactly what happened.

After some messing around, I realized a simple trick:

powershell_script 'script_name' do
  code <<-EOH
    copy $MyInvocation.ScriptName c:/chef_powershell_staging_file.ps1
  EOH
end

Passing array node attributes to powershell_script:

Seems that in defining a generic array, ruby inserts square brackets [ ] which actually become part of the string when consumed by powershell_script, and powershell chokes on it.

default['test_cookbook']['array'] = 'string1','string2'

default['test_cookbook']['array'] = %w(string1,string2)

In both of the above, Powershell will either throw an error or generally not work as expected

Missing type name after '['

What actually happens is that during the resource declaration phase, the Ruby array is interpolated into the code string in its inspect form, square brackets and quotes included (you can see it via chef-shell by creating a simple bash or powershell_script resource)

chef:attributes (12.8.1)> default['test_cookbook']['array'] = 'string1','string2'
=> ["string1", "string2"]

for example bash:
chef:recipe >bash 'some-bash' do
chef:recipe > code <<-EOH
chef:recipe"> $Array = #{node['test_cookbook']['array']}
chef:recipe"> EOH
chef:recipe ?> end
=> <bash[Set-VariableArray] .... @code: " $Array = [\"string1\", \"string2\"] \n" ...

using native ruby:
attribute:
default['a']['b'] = %w(a,b,c)
keeping the recipe the same, the resulting code will be:
... @code: " $Array = [\"a,b,c\"] \n" ...

Solution - simple in retrospect - double quotes:

node.default['a'] = "'value1', 'value2', 'value3'"

In your recipe, you'll get an actual powershell array:

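powershell_script 'script_name' do
  code <<-EOH
    # the attribute expands to 'value1', 'value2', 'value3', which PowerShell reads as an array
    $Array = #{node['a']}
    $Array.GetType().FullName | Out-File c:/out.txt
  EOH
end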
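
If you would rather keep the attribute as a real Ruby array, you can build the PowerShell literal at interpolation time instead. A minimal sketch, where powershell_array is a made-up helper (not part of Chef); it wraps the values in @( ) so that even a one-element list is still an array on the PowerShell side.

```ruby
# Turn a Ruby array into a PowerShell array literal such as @('a', 'b').
# Single quotes inside a value are doubled, which is how PowerShell
# escapes them in single-quoted strings.
def powershell_array(values)
  quoted = values.map { |value| "'#{value.to_s.gsub("'", "''")}'" }
  "@(#{quoted.join(', ')})"
end

values = %w[string1 string2]

# Plain interpolation uses Array#to_s, so the Ruby brackets leak through.
puts "$Array = #{values}"

# The helper emits something PowerShell can parse as an array.
puts "$Array = #{powershell_array(values)}"
```

In a recipe this would look like `$Array = #{powershell_array(node['test_cookbook']['array'])}` inside the code heredoc, with the helper defined in a cookbook library.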

Feb 16, 2016

Governments entering IT at glacier speeds.

What happens when you give powerful tools to people with low motivation?

They create products that make it attractive to go back to filling out stacks of paper forms by hand.

Problem of the day:
I was filling out a visa application for entering Japan. On the plus side, the file is a PDF and I can actually type the information into it. All of the fields are present. That's another plus.

The minus, is that input validation is broken on some fields. For example, of the 5 phone number entry fields, only 3 allow dashes. One of the date entry fields does not actually allow entry. Another date entry only allows 3 digits for the year.

But that's actually not bad.
What's bad, is that I could not print the thing. There was something literally not allowing me to print my own document.

This absolutely blew my mind. I attempted to print a few more times in complete disbelief for what was happening, before accepting the yak shave ahead of me as one of my own. First the obvious, was Ctrl+P different from File -> Print? Sadly, same result. (if you recall, Chrome takes over your print functionality)

So, can I just turn it off somewhere? Yes! Edit -> Preferences (it even has a hotkey!!)

And that is the Story of how I printed a form, that was likely butchered due to some government compliance rules on PDF security by people who were not given any autonomy and probably no explanation.

Homeownership