Wednesday, October 5, 2016

Modifying chef resources after they're already in resource collection

BOOM!

First, let's fire up chef-shell to demonstrate by creating a basic resource

$ chef-shell  
chef (12.14.57)> recipe_mode
chef:recipe (12.14.57)> file 'testing_edit' do
chef:recipe > content 'words'
chef:recipe ?> end
 => <file[testing_edit] @name: "testing_edit" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :create, :delete, :touch, :create_if_missing] @action: [:create] @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: nil @default_guard_interpreter: :default @elapsed_time: 0 @sensitive: false @declared_type: :file @cookbook_name: nil @recipe_name: nil @content: "words">

Easy way to modify resource collection

Now, I am going to modify this resource using a NEW recipe DSL method, edit_resource
chef:recipe (12.14.57)>
chef:recipe >
chef:recipe > edit_resource(:file, 'testing_edit') do
chef:recipe > content 'different words'
chef:recipe ?> end
 => <file[testing_edit] @name: "testing_edit" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :create, :delete, :touch, :create_if_missing] @action: [:create] @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: nil @default_guard_interpreter: :default @elapsed_time: 0 @sensitive: false @declared_type: :file @cookbook_name: nil @recipe_name: nil @content: "different words">
chef:recipe (12.14.57)>

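Coolness (effectively remove a resource from the run by setting its action to :nothing; the resource stays in the collection but does nothing):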

edit_resource(:file,'testing') do
chef:recipe > action :nothing
chef:recipe ?> end
 => <file[testing] @name: "testing" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :create, :delete, :touch, :create_if_missing] @action: [:nothing] @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: nil @default_guard_interpreter: :default @elapsed_time: 0 @sensitive: false @declared_type: :file @cookbook_name: nil @recipe_name: nil @content: "words">
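
To make the behaviour concrete, here is a plain-Ruby toy model of what the chef-shell session above shows: edit_resource looks a resource up by type and name, creates it if it is missing, and then changes it in place. ToyResource and ToyCollection are made-up names for illustration only, and this is not how Chef implements it internally. Chef's recipe DSL also has a strict edit_resource! that raises when the resource is not found; the toy version raises ArgumentError as a stand-in.

```ruby
# Toy stand-in for a Chef resource (hypothetical, not part of Chef).
ToyResource = Struct.new(:type, :name, :content, :action) do
  def to_s
    "#{type}[#{name}]"
  end
end

# Toy stand-in for the resource collection (hypothetical, not Chef's class).
class ToyCollection
  def initialize
    @resources = []
  end

  # Return the matching resource, or nil when there is none.
  def find(type, name)
    @resources.find { |r| r.type == type && r.name == name }
  end

  # Find-or-create, then let the block modify the resource in place.
  def edit_resource(type, name)
    resource = find(type, name)
    unless resource
      resource = ToyResource.new(type, name, nil, [:create])
      @resources << resource
    end
    yield resource if block_given?
    resource
  end

  # Strict variant: refuse to create a resource that was never declared.
  def edit_resource!(type, name, &block)
    raise ArgumentError, "#{type}[#{name}] not found" unless find(type, name)
    edit_resource(type, name, &block)
  end

  def size
    @resources.size
  end
end

collection = ToyCollection.new
collection.edit_resource(:file, 'testing_edit') { |r| r.content = 'words' }
collection.edit_resource(:file, 'testing_edit') { |r| r.content = 'different words' }
collection.edit_resource(:file, 'testing_edit') { |r| r.action = [:nothing] }
puts collection.size # still one resource, edited twice
```

The point of the sketch is that every edit hands back the same object that is already sitting in the collection, so whatever you change is what the converge phase will see.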


Tuesday, October 4, 2016

Setting up chef Automate / Workflow (aka: delivery) in completely air gapped environment - level 1

Manual delivery install in airgapped env (AWS in Oregon)


Creation of Air-gapped environment



Create DHCP option set

Create vpc 'alexv-manual automate in airgapped env'
  set DNS resolution to Yes
  set DNS Hostname to Yes
  set DHCP option set to one above

Create a Windows 'jump box' inside VPC
  network: vpc above
  subnet - create new
    VPC - vpc above
    AZ - no pref
    CIDR - same as vpc
  refresh vpc field and select subnet
  assign public IP - true
  Network - default
  Storage - default
   (make sure you have enough free space to store all of the binaries needed inside VPC)
   (i used 40 gigs) - this may mean that you have to expand default hard drive to occupy full HD space
  Tag:
    Name - alexv-jump box
  SG:
    create new SG 'jump box'
    RDP - anywhere
    HTTP - anywhere
    HTTPS - anywhere
  Select your keypair
  ** On your local Mac inside the RDP app, enable folder redirection when you add this box.
     set folder redirect to the location where your delivery.license file lives
  on the windows box, install filezilla - to make it easy to transfer files

Installing Delivery


Create Chef Server
  m3.medium
  VPC - same as above
  auto assign public ip - false
  storage - change to 30
  tag - alexv-chef-server
  SG:
    create new SG "chef server"
    open port 22
    open  All ICMP
    10000-10003
    8989
    HTTP
    HTTPS
  Keypair - select yours

Create Workflow server
  click on chef-server, select more like this
  VPC - same as above
  Subnet - internal subnet
  auto assign public ip - false
  storage - change to 30
  tag: alexv-Workflow-server
  SG:
    create new SG "Workflow server"
    open  port 22
    open  All ICMP
    10000-10003
    8989
    HTTPS
    HTTP
    (maybe needed?) 9200 - due to elastic search get errors
    (maybe needed?) 5672 - due to another elastic search failure?
  Keypair - select yours

Create Windows (or *nix) build node
  network: vpc above
  subnet
    VPC - vpc above
    AZ - no pref
    CIDR - same as vpc
  refresh vpc field and select subnet
  assign public IP - false
  Network - default
  Storage - default
  Tag:
    Name - alexv-windows build node
  SG:
    create new "windows build node"
    open RDP - anywhere
    open All ICMP
    open 5985-5986 anywhere (for WinRM)
  Select your keypair

  Internet Gateway:
    create internet gateway - alexv-air gapped
    attach to VPC (above)

  Route:
    when you create VPC, it created a route table
    edit:
      add 0.0.0.0/0 -> point at internet gateway
    Save

Create 4 CentOS boxes to be environment nodes
  medium size
  HDD default
  SG - copy from workflow server
  Name SG "environment nodes"
  create

Create 2 CentOS boxes to be build nodes
  medium size
  HDD - 15 gigs
  SG - copy from workflow server
  Name SG "build nodes"
  create


Actually Install and Configure Automate

on the Chef Server and Automate node - follow directions
===================
disable ipv6 in /etc/hosts
make sure they can ping each other
make sure they can resolve dns of each other
make sure they can't access the internet


Jump Box (or workstation)
===================
copy target os binaries into jump box: chef server, automate, push jobs server, chefdk, chef manage, supermarket if needed.
copy binaries to correct server /tmp folder
copy chefdk for use on workstation as a management node
setup user ssh auth
  ssh-keygen -t rsa -b 4096 -C "you@example.com"


Chef Server
===================
install chef server per directions
chef-server-ctl user-create alex alex alex@chef.io 'alexalex' --filename /tmp/alex_user.pem
chef-server-ctl org-create alex_org 'Fourth Coffee, Inc.' --association_user alex --filename /tmp/alex_org-validator.pem

install push jobs per directions:
  sudo chef-server-ctl install opscode-push-jobs-server --path /tmp/opscode-push-jobs-server.x86_64.rpm

sudo chef-server-ctl user-create delivery delivery user deliver@chef.io 'alexalex' --filename /tmp/delivery_user_key.pem
sudo chef-server-ctl org-create automate_org 'org description'  --filename /tmp/automate_org-validator.pem -a delivery

Install manage: (optional)
sudo chef-server-ctl install chef-manage --path /tmp/chef-manage-2.4.3-1.el6.x86_64.rpm
reconfigure chef, push, manage



on the Delivery server
===================
install delivery
setup command: sudo delivery-ctl setup \
                      --license /tmp/automate.license \
                      --fqdn ip-10-0-0-67.ec2.internal \
                      --key /tmp/chefserver/delivery_user_key.pem \
                      --server-url https://ip-10-0-0-80.ec2.internal/organizations/automate_org
copy all PEMs from chef server to delivery (validator, admin, delivery_user)
Enter name of your enterprise
  example: alex_ent
  (note: look for a bug here where enterprise is created, but admin creds are not displayed nor created in /etc/delivery/<enterprise-admin-credentials>)
  (if bugged) create enterprise manually
    delivery-ctl create-enterprise alex_ent --ssh-pub-key-file=/etc/delivery/builder_key.pub
Copy ChefDK binary to /tmp/chefdk-0.18.30-1.el6.x86_64.rpm
install build node
  sudo delivery-ctl install-build-node -I /tmp/chefdk-0.18.30-1.el6.x86_64.rpm -f 10.0.0.23 -u chef -P chef

Verify build node works with `knife node status`
  this will query push jobs server for status of each node (different from knife status)
  available means push jobs can communicate with the node (you will know that at least push jobs is running at this point)
Verify you can fire off a push job:
  knife job start chef-client --search '*:*'

create user (via UI or CLI)

add public ssh key from workstations `ssh-keygen` step to the user
  delivery ui -> user -> ssh pub key

Jump Box (or workstation)
===================
Install chefdk
configure knife.rb with delivery key for communication with chef server
  example:
  node_name            'delivery'
  chef_server_url       "https://ip-10-0-0-80.ec2.internal/organizations/automate_org"
  client_key           'C:\Users\chef\.chef\delivery.pem'
  trusted_certs_dir    'C:\Users\chef\.chef\trusted_certs'
  # analytics_server_url 'https://cad-chef-server/organizations/cad'
  cookbook_path 'C:\Users\chef\chef-demo\cookbooks'

fetch certs if needed
  knife ssl fetch

verify knife works
 knife node list
 (or from delivery server)
  knife node list -k /etc/delivery/delivery.pem -u delivery --server-url https://ip-10-0-0-80.ec2.internal/organizations/automate_org

Pull down all of the cookbook dependencies to be used in air-gapped env (i do it via berks)
  mkdir repo
  cd repo
  chef generate cookbook staging (this will be the first test cookbook)
  modify metadata.rb of seeding cookbook to include:
    depends 'delivery-truck'
    depends 'push-jobs'
    depends 'build_cookbook'
    depends 'delivery_build'
  mkdir seeding
  cd staging\.delivery\build_cookbook
  run `berks vendor ..\..\..\seeding` to pull down all dependencies into a local folder

upload necessary cookbooks up to chef server
  knife cookbook upload -o seeding -a
  (or alternatively `knife cookbook upload delivery-truck --include-dependencies -o seeding`)

test ssh auth to delivery box
  ssh -l alex@alex_ent -p 8989 ip-10-0-0-67.ec2.internal

Configure delivery cmd - C:\Users\chef\cookbooks\staging\.delivery\cli.toml
  in root of staging cookbook$ delivery setup -e alex_ent -o automate_org -s ip-10-0-0-67.ec2.internal -u alex

make sure you can interact with delivery via delivery cli:
  Verify API works
    delivery api get users
    delivery api get orgs
  verify you can create a project
    create a cookbook
    `delivery init` inside that cookbook


First pipeline
===================
I'll use the staging cookbook as it's a nice example
initialize delivery pipeline
  inside staging cookbook run `delivery init`
bump metadata.rb if needed
modify config.json to exclude the spec and test folders; otherwise foodcritic lints them too, and the workflow fails in the lint phase.
  $ cat config.json
      {
        "version": "2",
        "build_cookbook": {
          "name": "build_cookbook",
          "path": ".delivery/build_cookbook"
        },
        "delivery-truck":{
          "lint": {
            "foodcritic": {
              "excludes" : ["spec","test"]
            }
          }
        },
        "skip_phases": [],
        "build_nodes": {},
        "dependencies": []
      }
change Berksfile (of build cookbook)
  Since you're not connected to internetz, you'll fail all phases of workflow due to Berksfile
  change source to :chef_server
    $ cat Berksfile
      source :chef_server
      # or your internal supermarket
      metadata

      group :delivery do
        cookbook 'delivery_build'#, chef_api: :config
        cookbook 'delivery-base'#, chef_api: :config
        cookbook 'test', path: './test/fixtures/cookbooks/test'
      end

      #original
      # group :delivery do
      #   cookbook 'delivery_build', git: 'https://github.com/chef-cookbooks/delivery_build'
      #   cookbook 'delivery-base', git: 'https://github.com/chef-cookbooks/delivery-base'
      #   cookbook 'test', path: './test/fixtures/cookbooks/test'
      # end

add and commit changes
  git add -u
  git commit -m 'very descriptive comment'
  delivery review
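
If you would rather script the config.json change from the steps above than edit it by hand, something like the following should do it. This is a minimal sketch: add_foodcritic_excludes is a made-up helper name, and the .delivery/config.json path is an assumption about where the file sits relative to the cookbook root. It only touches the delivery-truck lint excludes shown in the config.json above and leaves every other key alone.

```ruby
require 'json'

# Merge foodcritic excludes into a parsed config.json hash and return it.
# (Hypothetical helper, not part of the delivery CLI.)
def add_foodcritic_excludes(config, excludes)
  truck = (config['delivery-truck'] ||= {})
  lint = (truck['lint'] ||= {})
  foodcritic = (lint['foodcritic'] ||= {})
  foodcritic['excludes'] = ((foodcritic['excludes'] || []) + excludes).uniq
  config
end

# Assumed location: run this from the root of the cookbook.
path = File.join('.delivery', 'config.json')
if File.exist?(path)
  config = add_foodcritic_excludes(JSON.parse(File.read(path)), %w[spec test])
  File.write(path, JSON.pretty_generate(config) + "\n")
end
```

Running it twice is harmless, since existing excludes are kept and duplicates are dropped.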


Bill of Materials:
===================
Filezilla (windows) - management node
Chef-server-core-12.9.1
delivery-core-0.5.346
push-jobs-1.1.6
chefdk-0.18.30-1.el6.x86_64.rpm
  note: seems like chefdk 0.17.17 doesn't work in an isolated environment; it fails with a Yajl error
chefdk-0.18.30 for windows
chef manage rpm
supermarket rpm
berks vendor of `build cookbook`
  should include all of the following:
     build-essential
     build_cookbook
     chef-ingredient
     chef-sugar
     chef_handler
     compat_resource
     delivery-base
     delivery-sugar
     delivery-truck
     delivery_build
     dmg
     git
     mingw
     packagecloud
     push-jobs
     runit
     seven_zip
     test
     windows
     yum
     yum-epel


Troubleshooting:
================

*) The setup command *may* create an enterprise for you. If you see that behavior and do not get credentials as output, you will have to delete the enterprise and create it again using the create-enterprise command.

*) The install-build-node command installs push jobs via this script:
https://github.com/chef/delivery/blob/114649cc8d6ddbf494a9666ef476e6a4b8523a7f/omnibus/files/ctl-commands/installer/gen_push_config.sh
..which is called by this script:
https://github.com/chef/delivery/blob/2ab9d4809e4ac1f237b52ee20088b1ac68d85af4/omnibus/files/ctl-commands/build-node/installer.rb#L217


Thursday, August 11, 2016

Getting started with DevOps - Basic Chef (and any other CI/CD environment)

I got a question a few weeks back, and I think the answer is worth sharing.


  • Do we need to have a chef workstation hosted in our cloud environment that everyone logs in to (something like a jump box configured with chefdk and all the plugins), or can users spin up VMs off their own machine and that is used as the Chef Workstation? 
  • If spinning up VMs off our machine, how do we connect to chef-repo which I’m setting up in AWS? 
  • We need to connect to git for source control. I have set up an enterprise git instance - how do I change the cookbooks to connect to our instance of git?


This is going to be a lot of words, because there is no easy answer...
Anyone starting a similar journey should take the suggestions below and run through them iteratively: version 1, version 2, 3, etc. Don't try to do everything at once.

Also, https://github.com/chef-customers/dojo-assessment-guide is a fantastic tool to figure out where you are in the DevOps journey, and where others typically go.

Git:

Source control will be your base, so it's first in the list.

For git, there are a lot of articles about "forking the code" and the eventual price of having done that. So, when it comes to community cookbooks, best course of action is - don't fork community cookbooks (or at the very least don't fork it for very long).

If you rely on a community cookbook, take it, along with the full git history, and upload it into your private git and your chef server. Any changes to community cookbooks should be done via a wrapper cookbook (mycompany_apache for apache, etc.).

If some feature is not supported (or you found a bug), make a change and ASAP push the change back to the community via PR - this is so you don't have to maintain the fork, and can take advantage of improvements in the public version. (example: you're using apache 6 with your private hotfix, apache 7 comes out, and it's drastically different. Due to your custom changes, you need to spend 20 hours merging the versions and applying any custom hotfixes you've accumulated. You further fork the code to make it work. You spend all your days fixing bugs. Your head hurts from drinking too much coffee....)

Chef repo:

So, a good pattern is to have a git repo called chef_base or mycompany_base, etc..
Usually, a user starts by cloning this repo locally. Any updates that would affect the whole company would be pushed back to git, so every user can benefit from it.

In the chef-base you'll have your environments folder with various environments, roles with roles, and a folder called cookbooks (execute `chef generate repo test` for a basic example). You would have chefignore and .gitignore filled out as per your org rules. You would then do something about the .chef folder - either have a symlink that points to a known location, use a known variable to load the file, or leave it up to the user to fill in the details. Typically the .kitchen or vagrant file to stand up a cloud workstation would live here (more on this later). The \cookbooks folder is either empty or has a global cookbook like chef_workstation in there.

So, when a user starts working with chef, they clone chef_base and have everything they need to get started. All they do, is go into \cookbooks folder, and git clone the thing they want to work on.  This keeps chef-repo and each cookbook they work on completely independent. If they want to add new community cookbooks to your org, they follow the same process as above: clone community cookbook into /cookbooks and push it internally.

Chef workstation:

So, you definitely want each user to have their own cloud workstation. Also, they should have the ability to create/destroy them whenever they need to. On average, workstations don't survive longer than a week (nor should they).

Locally is pretty easy. Use test kitchen, or vagrant, mount the local chef-repo folder inside the VM and you're done.
(Here is how kitchen would work: https://github.com/test-kitchen/kitchen-vagrant#-synced_folders)
(here is how vagrant would work: https://www.vagrantup.com/docs/synced-folders/basic_usage.html)

With AWS/Azure/VMware/etc., mounting a folder is done differently. In this scenario, when users run Test Kitchen, it would create additional VMs in the cloud. You'll need to set up a sync mechanism if your virtualization platform doesn't support mounting a local folder on a VM. You could give users an EBS volume they could share across the workstation and local dev machine. Or just a regular network share they can mount locally and on a workstation. Also, I heard https://atom.io/packages/remote-sync works really well, however I never touched it personally.

Key takeaway here is that lots of companies are going down the VM road.

The long version is that this decision will be guided by a couple of factors - how powerful your users' workstations are, your company's business direction, and how much money your DevOps initiative has been given.

What I've seen in the wild is very interesting. The best and the worst shops use nearly identical workstation hardware. On the one end of the bell curve, there are companies where employees have 2 gig laptops incapable of opening notepad in under 30 seconds. All work is done locally, and is very painful. On the other end, you also have 2 gig laptops - though typically surface and mac book minis - however, in this case, all of the work is done in the cloud, and these machines are plenty powerful since they are simply used to RDP into remote resources (and used for skype and facebook the rest of the time).


Hope that helps.
Alex-

Monday, April 25, 2016

Resetting opscode-reporting password

Once in a while upgrading opscode-reporting goes wrong. Or it doesn't start, you do manual cleanup, and basically the passwords go out of sync.

The solution is pretty straightforward - reset the passwords to what the system thinks the passwords should be.

1.
Open up /etc/opscode-reporting/opscode-reporting-secrets.json
Grab opscode_reporting & opscode_reporting_ro passwords and pipe them to opscode-pgsql

echo "ALTER USER opscode_reporting PASSWORD 'XXXXX' " | su -l opscode-pgsql -c 'psql'
echo "ALTER USER opscode_reporting_ro PASSWORD 'XXXXX' " | su -l opscode-pgsql -c 'psql'

You should get the result "ALTER ROLE" from each of the 'echo' commands

2.
Next, make sure rabbitmq password is in sync:
In the same .json file, in the "opscode_reporting" section, grab the "rabbitmq_password" and use it in place of XXXXX

PATH=/opt/opscode/embedded/bin:$PATH rabbitmqctl change_password runs XXXXX

3.
then chef-server-ctl restart opscode-reporting


4.
And finally, you might still be broken.
If you look at the process list and see an error similar to below, send the HUP to svlogd to reload the configs.


root      1456  0.0  0.0   4092   196 ?        Ss    2015   3:12 runsvdir -P /opt/opscode/service log: vlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?svlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?svlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?svlogd: pausing: unable to rename current: /var/log/opscode/opscode-reporting: file does not exist?

So grab the correct pid by running chef-server-ctl status

...
run: opscode-reporting: (pid 17407) 30088s; run: log: (pid 32415) 88051s
...

kill -HUP 32415


Sunday, April 3, 2016

Chef - Passing output of one resource as input to the next

There are a couple of ways to do that.

One is via lazy

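directory '/tmp'

file '/tmp/alex.txt' do
  content 'sudo make me a sandwich'
end

# Runs at converge time and stores the command output on the node.
ruby_block 'something' do
  block do
    command = 'cat /tmp/alex.txt'
    command_out = shell_out(command)
    # node.set is the older alias of node.normal and is deprecated in newer Chef releases
    node.normal['a'] = command_out.stdout
  end
  action :create
end

# lazy delays reading node['a'] until this resource converges.
file '/tmp/alex2.txt' do
  action :create
  owner 'root'
  group 'root'
  mode '0644'
  content lazy { node['a'] }
end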
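
Why lazy is needed at all is easier to see outside of Chef. The recipe is read top to bottom first (compile), and only afterwards do the resources actually run (converge), so anything you read from the node while the recipe is being compiled is read too early. Here is a plain-Ruby toy model of that, with lambdas standing in for resources. simulate_run and everything inside it are made-up names, and this is only a sketch of the ordering, not Chef's real machinery.

```ruby
# Toy model of compile vs converge (hypothetical, not Chef's classes).
def simulate_run
  node = {}       # stands in for node attributes
  resources = []  # stands in for the resource collection
  written = {}    # what the second "file" resource ends up writing

  # Compile phase: the ruby_block is declared here but has NOT run yet.
  resources << -> { node['a'] = 'sudo make me a sandwich' }

  eager_content = node['a']         # read during compile, so still nil
  lazy_content  = -> { node['a'] }  # read later, like lazy { node['a'] }

  write_file = lambda do
    written[:eager] = eager_content
    written[:lazy]  = lazy_content.call
  end
  resources << write_file

  # Converge phase: resources run in the order they were declared.
  resources.each(&:call)
  written
end

result = simulate_run
puts result[:eager].inspect
puts result[:lazy].inspect
```

The eager value was captured before the first resource ran, so it is empty; the lazy one is only evaluated when the second resource converges, by which point the value is there.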

Thursday, March 31, 2016

Powershell and chef - how to view the script powershell_script resource is executing

So, I was troubleshooting passing arrays to powershell_script resource.

Troubleshooting powershell_script

First - the easy way. Just run chef-client -l debug. In debug logging, you can see the whole script, which might be enough.

What makes troubleshooting powershell_script difficult, is the way it works from inside chef. A temporary file is created, and immediately nuked after execution, making it somewhat difficult to see exactly what happened.

After some messing around, I realized a simple trick:
powershell_script 'script_name' do
  code <<-EOH
    copy $MyInvocation.ScriptName c:/chef_powershell_staging_file.ps1
  EOH
end

Passing array node attributes to powershell_script:

It seems that when you define a generic array, Ruby inserts square brackets [ ] which actually become part of the string when consumed by powershell_script, and PowerShell chokes on them.
default['test_cookbook']['array'] = 'string1','string2'
default['test_cookbook']['array'] = %w(string1,string2)
With either of the above, PowerShell will either throw an error or generally not work as expected:
Missing type name after '['
What actually happens is that, during the resource declaration phase, the array is interpolated into the code string in its Ruby form, square brackets and quotes included (you can see it via chef-shell by creating a simple bash or powershell_script resource)

chef:attributes (12.8.1)> default['test_cookbook']['array'] = 'string1','string2'
=> ["string1", "string2"]
for example bash:
chef:recipe > bash 'some-bash' do
chef:recipe > code <<-EOH
chef:recipe"> $Array = #{node['test_cookbook']['array']}
chef:recipe"> EOH
chef:recipe ?> end
=> <bash[Set-VariableArray] .... @code: " $Array = [\"string1\", \"string2\"] \n" ...
using native ruby:
attribute:
default['a']['b'] = %w(a,b,c)
(note that %w splits on whitespace, not commas, so this is a one-element array)
keeping the recipe the same, the resulting code will be:
... @code: " $Array = [\"a,b,c\"] \n" ...

Solution - simple in retrospect - double quotes:
node.default['a'] = "'value1', 'value2', 'value3'"
In your recipe, you'll get an actual powershell array:

powershell_script 'script_name' do
  code <<-EOH
    Out-File -InputObject "#{node['a']}".GetType() c:/out.txt
  EOH
end
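
If you would rather keep a real Ruby array in the attribute and only convert it at the last moment, a small helper can build the PowerShell literal for you. This is a minimal sketch in plain Ruby: powershell_array_literal is a made-up name, not something Chef ships. It wraps the values in @( ) so that a single-element list is still an array on the PowerShell side, and doubles any embedded single quotes, which is how PowerShell escapes them inside single-quoted strings.

```ruby
# Render a Ruby array of strings as a PowerShell array literal.
# (Hypothetical helper, not part of Chef.)
def powershell_array_literal(values)
  quoted = values.map do |value|
    escaped = value.to_s.gsub("'", "''") # PowerShell doubles single quotes
    "'#{escaped}'"
  end
  "@(#{quoted.join(', ')})"
end

array = %w[string1 string2]

# Plain interpolation uses Ruby's own array syntax, brackets and all.
puts "$Array = #{array}"                            # $Array = ["string1", "string2"]

# The helper produces something PowerShell can parse.
puts "$Array = #{powershell_array_literal(array)}"  # $Array = @('string1', 'string2')

# %w splits on whitespace, so commas end up inside a single element.
puts %w(string1,string2).inspect                    # ["string1,string2"]
```

Inside a recipe the helper would be called within the heredoc interpolation, in the same spot where the attribute is interpolated in the examples above.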

Tuesday, February 16, 2016

Governments entering IT at glacial speeds.

What happens when you give powerful tools to people with low motivation?

They create products that make it attractive to go back to filling out stacks of paper forms by hand.

Problem of the day:
I was filling out a Visa application for entering Japan. On the plus side, the file is a PDF and I can actually type the information into them. All of the fields are present. That's another plus.

The minus, is that input validation is broken on some fields. For example, of the 5 phone number entry fields, only 3 allow dashes. One of the date entry fields does not actually allow entry. Another date entry only allows 3 digits for the year.

But that's actually not bad.
What's bad, is that I could not print the thing. There was something literally not allowing me to print my own document.





This absolutely blew my mind. I attempted to print a few more times in complete disbelief for what was happening, before accepting the yak shave ahead of me as one of my own. First the obvious, was Ctrl+P different from File -> Print? Sadly, same result. (if you recall, Chrome takes over your print functionality)






So, can I just turn it off somewhere? Yes! Edit -> Preferences (it even has a hotkey!!)




And that is the story of how I printed a form that was likely butchered due to some government compliance rules on PDF security by people who were not given any autonomy and probably no explanation.

Homeownership