Hi
If you suspect a DNS resolution issue, here are steps to help diagnose and troubleshoot it:
Ensure that your head node is using the correct DNS servers. Your /etc/resolv.conf file should contain the IP addresses of your AD DNS servers (IP_1_AD and IP_2_AD). It should look something like this:
nameserver IP_1_AD
nameserver IP_2_AD
If these entries aren't present or if other DNS servers are listed, update it to reflect the AD DNS servers.
Test DNS:
nslookup AD_DOMAIN_NAME
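The two checks above could be scripted roughly as follows. This is a minimal sketch rather than part of the original answer: list_nameservers and has_nameservers are hypothetical helper names, and IP_1_AD, IP_2_AD and AD_DOMAIN_NAME remain placeholders. On many Ubuntu releases /etc/resolv.conf is managed by systemd-resolved, so manual edits may not survive a reboot.

```shell
# Hypothetical helpers; the names are illustrative only.

# List the nameserver addresses found in a resolv.conf-style file.
# Usage: list_nameservers [FILE]   (FILE defaults to /etc/resolv.conf)
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed only if every expected address is listed as a nameserver.
# Usage: has_nameservers FILE IP [IP...]
has_nameservers() {
    file=$1
    shift
    for ip in "$@"; do
        # -x matches the whole line, -F treats the address as a fixed string
        list_nameservers "$file" | grep -qxF -- "$ip" || {
            echo "missing nameserver: $ip" >&2
            return 1
        }
    done
}
```

A possible invocation, with the placeholders replaced by real values: has_nameservers /etc/resolv.conf IP_1_AD IP_2_AD && nslookup AD_DOMAIN_NAME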
Check SSSD:
Make sure your SSSD configuration file /etc/sssd/sssd.conf has the correct domain and DNS settings. In the [domain/AD_DOMAIN_NAME] section, you might want to set:
dns_discovery_domain = AD_DOMAIN_NAME
Also, ensure that the SSSD service is running (for example, with sudo systemctl status sssd).
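One way to check the service state from a script is sketched below. It assumes a systemd-based host, which is the default on Ubuntu; unit_is_active is a hypothetical helper name.

```shell
# Hypothetical helper; assumes systemd is the init system.
# `systemctl is-active` prints the unit state and exits 0 only when it is "active".
# Usage: unit_is_active [UNIT]   (UNIT defaults to sssd)
unit_is_active() {
    unit=${1:-sssd}
    # "|| true" keeps the function usable in scripts that run under `set -e`
    state=$(systemctl is-active "$unit" 2>/dev/null) || true
    echo "$unit: ${state:-unknown}"
    [ "$state" = "active" ]
}
```

If unit_is_active sssd fails, sudo systemctl status sssd and sudo journalctl -u sssd are the usual next places to look.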
Check for Firewalls and Security Groups:
Ensure that your security groups, NACLs, and any OS-level firewalls (e.g., ufw on Ubuntu) allow communication on DNS port 53 between the head node and your AD servers.
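A quick way to exercise port 53 end to end is to send a query to each AD DNS server directly instead of relying on /etc/resolv.conf. The sketch below assumes nslookup is installed and exits non-zero when a lookup fails; check_dns_servers is a hypothetical helper name.

```shell
# Hypothetical helper: query each DNS server directly for NAME.
# Usage: check_dns_servers NAME SERVER [SERVER...]
check_dns_servers() {
    name=$1
    shift
    failed=0
    for server in "$@"; do
        # "nslookup NAME SERVER" sends the query to SERVER only
        if nslookup "$name" "$server" >/dev/null 2>&1; then
            echo "ok: $server answered for $name"
        else
            echo "FAIL: $server did not answer for $name"
            failed=1
        fi
    done
    return "$failed"
}
```

For example: check_dns_servers AD_DOMAIN_NAME IP_1_AD IP_2_AD. A failure here does not by itself distinguish a blocked port from a missing DNS record, so treat it as a hint about where to look, not as a diagnosis.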
So I modified the /etc/resolv.conf file to add the IPs of my AD, and I managed to join my AD using "sudo realm join -U AD_DOMAIN_NAME".
With that step solved, we are now having trouble establishing a "sudo login" with an AD_USER.
Do we need to modify any other file, or perhaps restart a Linux service?
Now, if you're facing issues with logging in using an AD user, here’s a step-by-step guide to ensure a smooth login process:
Check /etc/sssd/sssd.conf:
Make sure your SSSD configuration file is correctly set up. Here’s a sample configuration:
[sssd]
services = nss, pam
config_file_version = 2
domains = AD_DOMAIN_NAME

[domain/AD_DOMAIN_NAME]
id_provider = ad
auth_provider = ad
access_provider = ad
ldap_id_mapping = True
cache_credentials = True
# Optionally specify the ad_hostname if needed
#ad_hostname = headnode.example.com
override_homedir = /home/%u
default_shell = /bin/bash
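After editing the file, it may be worth confirming its permissions. The sssd.conf manual page states that the file must be owned by root and readable and writable by root only. The sketch below checks the mode only; it assumes GNU stat, as shipped on Ubuntu, and check_conf_mode is a hypothetical helper name.

```shell
# Hypothetical helper; assumes GNU stat (the -c option).
# Ownership is not checked here: the file should also belong to root.
# Usage: check_conf_mode [FILE]   (FILE defaults to /etc/sssd/sssd.conf)
check_conf_mode() {
    file=${1:-/etc/sssd/sssd.conf}
    mode=$(stat -c '%a' "$file") || return 2
    if [ "$mode" = "600" ]; then
        echo "ok: $file has mode 600"
    else
        echo "warning: $file has mode $mode, expected 600"
        return 1
    fi
}
```

Run it from a root shell, since /etc/sssd is usually not accessible to regular users. If the mode is wrong, sudo chmod 600 /etc/sssd/sssd.conf corrects it.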
Restart the SSSD and Other Services:
After modifying /etc/sssd/sssd.conf, restart the following services to apply the changes:
sudo systemctl restart sssd
sudo systemctl restart realmd
sudo systemctl restart nscd # Name Service Cache Daemon (optional)
Modify PAM Configuration (Pluggable Authentication Modules):
Ensure that your PAM configuration files allow AD logins:
Edit /etc/pam.d/common-auth and ensure the following line is present:
auth [success=1 default=ignore] pam_sss.so use_first_pass
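To create home directories automatically on first login, ensure the following line is present in /etc/pam.d/common-session:
session required pam_mkhomedir.so skel=/etc/skel/ umask=0077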
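To confirm that the PAM modules mentioned above are actually enabled, a small check along these lines may help. It is a sketch only: pam_has_module is a hypothetical helper name, and it merely looks for an uncommented line that names the module.

```shell
# Hypothetical helper: succeed if FILE has an uncommented line naming MODULE.
# Usage: pam_has_module FILE MODULE
pam_has_module() {
    awk -v mod="$2" '
        /^[ \t]*#/ { next }                                   # skip comment lines
        { for (i = 1; i <= NF; i++) if ($i == mod) found = 1 } # exact field match
        END { exit !found }                                   # exit 0 only when found
    ' "$1"
}
```

For example: pam_has_module /etc/pam.d/common-auth pam_sss.so and pam_has_module /etc/pam.d/common-session pam_mkhomedir.so. On Ubuntu, sudo pam-auth-update --enable mkhomedir is a common alternative to editing the session file by hand.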
Allow SSH Access:
If you’re trying to SSH into the head node as the AD user, make sure the user is not restricted:
Edit /etc/ssh/sshd_config and ensure UsePAM yes is set. If you restrict SSH access by group, make sure the AD group is allowed, for example:
AllowGroups domain^users
DomainReadOnlyUser: cn=ReadOnlyUser,ou=Users,ou=AWS-JIMMY,dc=CORP,dc=EXAMPLE,dc=COM
Should we adapt ReadOnlyUser to the name of the group we define in our AD forest?
I tried what you proposed but without success so far.
Recipe: aws-parallelcluster-environment::finalize_directory_service
- execute[Fetch user data from remote directory service] action run
[2024-09-23T06:40:14+00:00] INFO: Processing execute[Fetch user data from remote directory service] action run (aws-parallelcluster-environment::finalize_directory_service line 24)
[2024-09-23T06:40:14+00:00] ERROR: execute[Fetch user data from remote directory service] (aws-parallelcluster-environment::finalize_directory_service line 24) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '2'
---- Begin output of sudo -u ubuntu getent passwd ReadOnlyUser ----
STDOUT:
STDERR:
---- End output of sudo -u ubuntu getent passwd ReadOnlyUser ----
Ran sudo -u ubuntu getent passwd ReadOnlyUser returned 2; ignore_failure is set, continuing
I found this error while digging into the /var/log/chef-client.log file.
After doing all these steps, I still get a "Permission denied" error when trying to connect via SSH to the cluster's instance. Here are the lines in auth.log about these attempts:
Sep 23 13:50:10 aws sshd[1186]: Connection reset by authenticating user user@domain IP_src port 24058 [preauth]
Sep 23 13:50:30 aws sshd[1189]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=IP_SRC user=user@domain
Sep 23 13:50:30 aws sshd[1189]: pam_sss(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=IP_src user=user@domain
Sep 23 13:50:30 aws sshd[1189]: pam_sss(sshd:auth): received for user user@domain: 4 (System error)
Sep 23 13:50:32 aws sshd[1189]: Failed password for user@domain from IP_SRC port 62867 ssh2
Sep 23 13:50:34 aws sshd[1189]: Connection reset by authenticating user user@domain IP_SRC port 62867 [preauth]
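In the log above, the pam_unix failure is expected for an AD user, since the account is not in the local password database. The pam_sss lines are the relevant ones. A sketch for pulling just those lines out of a log follows; pam_sss_errors is a hypothetical helper name, and the default path assumes Ubuntu's /var/log/auth.log.

```shell
# Hypothetical helper: print pam_sss messages that mention USER.
# Usage: pam_sss_errors USER [LOGFILE]   (LOGFILE defaults to /var/log/auth.log)
pam_sss_errors() {
    # First keep only pam_sss lines, then only those mentioning the user
    grep -F 'pam_sss(' "${2:-/var/log/auth.log}" | grep -F -- "$1"
}
```

For example: sudo sh -c 'grep -F "pam_sss(" /var/log/auth.log' gives the unfiltered equivalent. A "4 (System error)" result usually means the detail is in SSSD's own logs under /var/log/sssd/, not in auth.log.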
When I run the command
sudo getent passwd user
it gives this result:
user:*:1338001636:1338000513:Admin User:/home/user@domain:/bin/bash
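Because this lookup succeeds, identity resolution through SSSD appears to work, which points at the authentication step and not at NSS. The entry also shows a home directory of /home/user@domain, whereas override_homedir = /home/%u would be expected to yield /home/user, so that setting may not be in effect. A small sketch for reading those fields follows; passwd_home_and_shell is a hypothetical helper name.

```shell
# Hypothetical helper: print the home directory (field 6) and login shell
# (field 7) of each passwd-format entry read from standard input.
passwd_home_and_shell() {
    awk -F: '{ print "home=" $6 " shell=" $7 }'
}
```

For example: getent passwd user | passwd_home_and_shell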
After running this command, it appears that I can switch to an AD user:
sudo su <AD_user>
But when we remove "sudo" and enter the AD user's password, it doesn't work. How can we ensure password synchronisation between AD and Linux?
Information that could help solve this issue: when we ran
sudo apt install krb5-user
it gave us this result:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
krb5-user : Depends: libkrb5-3 (= 1.17-6ubuntu4.7) but 1.19.2-2ubuntu0.4 is to be installed
E: Unable to correct problems, you have held broken packages.
Because authentication does not seem to work on our cluster and krb5-user appears to be involved in it, do you know how we can resolve this dependency problem and install the package?
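The apt output above shows krb5-user requiring one libkrb5-3 version while a different one is the install candidate. That pattern is often a sign of package sources from more than one release, or of a pinned or held package, though this is only a guess for this system. The sketch below pulls the conflicting versions out of apt's message; unmet_deps is a hypothetical helper name.

```shell
# Hypothetical helper: read apt output on stdin and print
# "package required-version candidate-version" for each unmet "Depends:" line.
unmet_deps() {
    sed -n 's/^.*Depends: \([^ ]*\) (= \([^)]*\)) but \([^ ]*\) is to be installed.*$/\1 \2 \3/p'
}
```

To see where each version comes from, apt-cache policy libkrb5-3 krb5-user lists the candidate versions and their sources, and apt-mark showhold lists held packages.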
Our head node OS is Ubuntu.
Pings to IP_1_AD and IP_2_AD work properly.
We suspect that something is going wrong with DNS resolution.