Friday, October 25, 2013

DNS error after migrating Chef from version 10 to 11

This week I'm testing for possible errors if we migrate our Chef server to version 11.0.6. I know version 11.0.8 is available, but on Ubuntu it throws a Ruby timezone error that I still need to deal with.

So far the migration has gone smoothly, following the steps in the official guide. Restoring the backup went fine and a test client was able to communicate with the server. However, after modifying a cookbook, I got this error when the client applied the template change:

FATAL: SocketError: template[/etc/rsyslog.d/22-XXXXXX.conf] (Deploy_configuration_na_olive-logging-rsyslog::default line 13) had an error: SocketError: Error connecting to https://ip-10-XX-XX-XX.us-west-5.compute.internal/bookshelf/organization-00000000000000000000000000000000/checksum-b3f32a70cedbe6de9ac38?AWSAccessKeyId=XXXXXXXXX&Expires=1382603029&Signature=XXXXXXXX - getaddrinfo: Name or service not known

The key part of the message is getaddrinfo: Name or service not known. The client could not resolve the hostname in the URL because it is an internal EC2 DNS name, and the client and the server are in different geographic regions. That in itself is not the real issue, though. The problem is that the Chef server is using its internal DNS name in its settings, so the bookshelf URL it hands to the client points at that name.
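
A quick way to confirm this from the client is to check whether the hostname in the failing URL resolves at all. Below is a minimal Ruby sketch of that check. It is not something from the migration guide; the helper names url_host and resolvable? and the example URL are made up for illustration, so substitute the URL from your own error message.

```ruby
require "socket"
require "uri"

# Hypothetical helper: pull the hostname out of a URL such as the
# bookshelf URL shown in the SocketError message.
def url_host(url)
  URI.parse(url).host
end

# Hypothetical helper: true if this machine's resolver can resolve the
# hostname, false if getaddrinfo fails (the same failure chef-client hit).
def resolvable?(host)
  Addrinfo.getaddrinfo(host, nil)
  true
rescue SocketError
  false
end

# Placeholder URL (assumption); replace it with the URL from your error.
url = "https://chef.example.invalid/bookshelf/organization-0/checksum-0"
host = url_host(url)
puts "#{host}: #{resolvable?(host) ? 'resolves' : 'does not resolve'}"
```

If the hostname does not resolve from the client but does from the server itself, the server is most likely advertising a name that only makes sense inside its own network.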

To rectify this, modify (or create) the file /etc/chef-server/chef-server.rb with this content:

server_name = "<your chef server external DNS>"
bookshelf['url'] = "https://#{server_name}"
bookshelf['vip'] = server_name
nginx['url'] = "https://#{server_name}"
nginx['server_name'] = server_name
lb['api_fqdn'] = server_name
lb['web_ui_fqdn'] = server_name
api_fqdn server_name

Then, as the migration user, reload the settings and restart the server:

sudo chef-server-ctl reconfigure ; sudo chef-server-ctl restart

Now try applying the changes on the client again. If it still doesn't work, try killing all Chef processes and starting them from scratch. For some reason I needed to do this, as my Chef processes were not being killed.
