Zero-Trust: What it is and what it means for your IBM i security implementation.
As is often the case with any technology, the security industry will start to use a term that sounds cool, but when you ask someone what it means, few people can actually articulate an accurate definition. “Zero-trust” is one of those terms. If you search “zero-trust,” you’ll see a list of sponsored results from a plethora of security vendors, all claiming to have some unique definition that (obviously) at least includes, if it is not solely solved by, that company’s security solution. In this article, I hope to demystify zero-trust and explain what you’ll want to consider when implementing it on IBM i.
The concept of zero-trust was first introduced in 2004 and then made popular by Forrester researcher John Kindervag. It started because the most popular architecture—that of a castle and moat—wasn’t sufficient. With this architecture, once someone made it to the other side of the moat and through the castle walls, they had the run of the castle. Bringing this example into the 21st century, many organizations (still) configure their network so that, once authenticated, no (or very little) further authentication is required for access within the internal network. And “internal network” often equates to a physical office building. To me, the pandemic really drove home the limitations of this architecture (that is, the dependence of access on where you are) because, with all of us working from home, the perimeters defining who had access and what could be accessed needed to be expanded, greatly increasing the vulnerabilities associated with this model. The castle-and-moat model doesn’t take into consideration stolen credentials that let predators in that first door and then allow them to move around unchallenged (this was the major issue with the SolarWinds breach). It also offers little protection against malicious insiders and/or accidental errors (for example, a bad firewall rule change that opens the organization to the Internet).
Taking the posture of zero-trust means that you reauthenticate at every entry point and don’t assume that anyone should have access to anything. To me, zero-trust is a mashup of the concepts of deny by default, least privilege access, and defense in depth in conjunction with regular verification that your policies (user profile capabilities and group membership, global security settings, access control settings, etc.) are regularly reviewed and verified and not assumed to be correct or still required. Zero-trust is more about an architecture and an organization-wide attitude than it is about a single solution. In fact, if a vendor asserts that they can be your sole zero-trust solution provider, you should walk away. To be successful, a zero-trust posture must be implemented throughout the organization, and no single vendor can do that.
The next sections examine what you’ll want to consider in a zero-trust architecture for IBM i. But understand that organizations are going to vary in the degree to which the zero-trust architecture is implemented. I’m laying out all of the areas you’ll want to consider, but how many of these considerations you implement will depend on your organization’s policy.
System Value and Authentication Considerations
To implement zero-trust on IBM i, you’ll generally want to implement security best practices for your IBM i security-related system values, but there are a few that are critical to this architecture:
Obviously, QSECURITY must be set to at least level 40 if not 50. Level 20 creates all profiles with *ALLOBJ by default, which obviously overrides any effort to implement a deny-by-default architecture. Security level 30 is a risk to this architecture because a job can be submitted using a job description (*JOBD) that names a user profile (*USRPRF) when the submitter has authority only to the *JOBD, not to the profile it names. This allows you to elevate your privileges simply by submitting a job using a *JOBD that specifies a powerful profile.
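As a quick sketch, the current level can be checked and raised with standard CL commands (a change to QSECURITY doesn’t take effect until the next IPL, so plan accordingly):

```
/* Display the current security level */
DSPSYSVAL SYSVAL(QSECURITY)

/* Raise it to level 40 (takes effect at the next IPL) */
CHGSYSVAL SYSVAL(QSECURITY) VALUE('40')
```

Before raising the level, audit for applications that depend on level 20 or 30 behavior so the change doesn’t break them.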
Organizations should require strong passwords (and emphasize to users that they should not reuse passwords, especially from their personal lives). To implement strong passwords, the system must be running at least password level 2 but preferably 3 or, at IBM i 7.5, level 4. Implementing strong password composition rules requires the use of the QPWDRULES system value, which allows more options than the individual password system values. QPWDRULES should include the values of *LMTPRFNAME to ensure the password cannot be a default password and *ALLCRTCHG to ensure that password composition rules are enforced even when running the Create or Change User Profile (CRT/CHGUSRPRF) commands.
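By way of illustration, a QPWDRULES setting that includes the two values above plus some example composition rules might look like the following (the minimum length and character-count rules here are assumptions chosen to show the syntax; set them per your password policy):

```
CHGSYSVAL SYSVAL(QPWDRULES) +
          VALUE('*ALLCRTCHG *LMTPRFNAME *MINLEN12 *DGTMIN1 *SPCCHRMIN1')
```

Remember that once QPWDRULES contains values other than *PWDSYSVAL, it takes precedence over the individual password composition system values.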
Obviously, default passwords cannot be allowed in a zero-trust implementation. What may not be as obvious is that no passwords can be shared, not even QSECOFR. If multiple people know a password, how can one determine who has used it? To address this issue for shared profiles such as QSECOFR, I see many organizations using a password vault that changes the password after a one-time use.
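IBM i includes the Analyze Default Passwords (ANZDFTPWD) command for finding profiles whose password matches the profile name; a sketch:

```
/* Report profiles with default passwords without taking action */
ANZDFTPWD ACTION(*NONE)

/* Or report and disable those profiles in one step */
ANZDFTPWD ACTION(*DISABLE)
```

Run the report-only form first so you don’t disable a service account your operations depend on before you’ve assigned it a proper password.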
The password expiration interval (the QPWDEXPITV system value) cannot be *NOMAX, which allows the password to never expire. This should be a “duh” moment, but you’d be surprised at how many systems I’ve seen set to have non-expiring passwords. Some may challenge this assertion if they’ve implemented Kerberos (a single sign-on solution) because passwords for users can be set to *NONE when using Kerberos. But some user profiles will need to retain a password (such as QSECOFR and system administrators), so the assertion still holds. In addition, the password expiration interval can be overridden in a user’s profile. That value also cannot be set to *NOMAX, and the profile-level override is abused much more often than the QPWDEXPITV system value. Note that there are times when it is more reasonable to have a non-expiring password—namely, for service accounts—when an expiring password would disrupt automated processes. That said, if you are truly implementing a pure zero-trust architecture, you’d leave those profiles with an expiring password and change them regularly.
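For illustration (the 90-day interval and the SOMEUSER profile name are assumptions), setting the system value and removing a profile-level override might look like this:

```
/* Expire passwords every 90 days */
CHGSYSVAL SYSVAL(QPWDEXPITV) VALUE('90')

/* Remove a profile-level *NOMAX override so the system value applies */
CHGUSRPRF USRPRF(SOMEUSER) PWDEXPITV(*SYSVAL)
```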
A small (but not unlimited, that is, not *NOMAX) number of sign-on attempts needs to be defined via the QMAXSIGN system value. Three to five attempts is typical. As of IBM i 7.5, this value can be overridden in the user profile. But once again, it should never be set to *NOMAX.
The QMAXSGNACN system value defines the action the system will take once the maximum number of sign-on attempts has been reached. The action must always include the disabling of the *USRPRF (option 2 or 3) to ensure the user is reauthenticated by a prescribed process before the profile is re-enabled. To only disable the device (*DEVD), you must have implemented named devices rather than using the system’s virtual devices (QPADEV*). Otherwise, if the virtual device is disabled, a new device is automatically created, allowing the user to keep trying to sign on.
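A sketch of the pair of system values working together (three attempts is the assumed policy here):

```
/* Allow at most three sign-on attempts */
CHGSYSVAL SYSVAL(QMAXSIGN) VALUE('3')

/* After that, disable both the device and the user profile (option 3) */
CHGSYSVAL SYSVAL(QMAXSGNACN) VALUE('3')
```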
Note: If you’ve implemented named devices, you can also use the QLMTSECOFR system value, which requires anyone with *ALLOBJ or *SERVICE special authority to be specifically authorized to the *DEVD. Using this system value with virtual devices is pointless because *ALLOBJ profiles will have to be authorized to all devices since there’s no predicting which device a user will be assigned when attempting to sign on.
The QINACTITV system value (together with QINACTMSGQ, which defines the action) controls the time an IBM i session can be inactive before some action is taken (either ending the job or disconnecting it). Many organizations have difficulty implementing a timeout on IBM i due to the way their applications have been architected. Many applications don’t have rollback capabilities, so if the job is ended in the middle of a transaction (say someone is halfway through adding a new account and they leave to go to a meeting), it’s often a manual process to undo the first steps of the transaction so that the entire transaction can be completed successfully in the new job. Not implementing a timeout on IBM i isn’t a problem, in my opinion, as long as there’s a group policy in place that times out the entire device. Many organizations have this policy in place to protect against someone sitting down at another user’s device and doing work as the user who originally signed on to the workstation. The problem is that most organizations have set a far-too-long timeout interval; I typically see 15 to 30 minutes. That’s way too long for a device to sit logged in but unattended. The most the timeout interval should be set to is 5 minutes. Many of you are balking at that, saying that you have users who work at their desks and are not using their workstations, so even though they’re sitting right there, they’ll have to re-authenticate. That’s true, but this is where technology can make a zero-trust architecture much less intrusive. Many Windows devices allow authentication via facial recognition. What’s the big deal if all I have to do is look at my screen to unlock my device? In that case, I’d make it an even shorter timeout. And this leads me to my next discussion point.
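If you do implement an IBM i timeout, here is a sketch of the two system values involved (5 minutes and disconnect-rather-than-end are the assumed policy choices):

```
/* Take action on interactive jobs after 5 minutes of inactivity */
CHGSYSVAL SYSVAL(QINACTITV) VALUE('5')

/* Disconnect (rather than end) the inactive job */
CHGSYSVAL SYSVAL(QINACTMSGQ) VALUE('*DSCJOB')
```

Disconnecting rather than ending lets the user resume the job after reauthenticating, which sidesteps much of the mid-transaction cleanup problem described above.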
Multi-Factor Authentication (MFA)
Requiring authentication using multiple factors is key to a successful zero-trust architecture. And this is especially true if you’ve implemented some sort of single sign-on, either one in which passwords are synced around the network or one in which Kerberos is in use. A Kerberos implementation allows you to provide your user ID and password once and never again have to use them when signing in to other servers around the network, including IBM i sessions. This (obviously) flies in the face of a zero-trust architecture. Am I suggesting that you not use Kerberos? No. But I am suggesting that you should implement MFA. A Kerberos implementation still has the benefit of not sending user IDs and passwords around the network. But if that’s the only form of authentication, once a user has signed on to their workstation, you must assume that another person with access to that workstation can act with the authority of the original signed-on user unless another form of authentication is required. And that, obviously, is a risk that organizations should want to avoid—and must avoid if implementing zero-trust.
Least Privilege Access (User Profile Configuration)
If you’ve read my books and articles or heard me speak, you know how many times I’ve said, “Give user profiles only the capabilities they need and no more.” I continue to repeat this message because I still see user profiles provisioned with more special authorities than what is required to perform their job functions. I cannot emphasize enough the exposure this presents. Why give users the capability to perform tasks outside of their assigned job functions?
Another area to which this applies is group profile assignment. Users should not be placed in a group unless they need all of the capabilities of that group. Many admins look just at the special authorities of a group profile without thinking about the objects the group owns or is authorized to. One such over-assigned profile is QPGMR. I encourage you to run the Display User Profile command with the object owned and object authorized options (DSPUSRPRF USRPRF(QPGMR) TYPE(*OBJOWN) and DSPUSRPRF USRPRF(QPGMR) TYPE(*OBJAUT)). If you aren’t shocked by these lists, then I’ll be shocked! QPGMR should not be used as a group profile because many vendors ship their products with QPGMR as the owner. Thus, all members of QPGMR also own (i.e., have all authority to) those products. Likewise, IBM ships QPGMR authorized to many commands—some a developer may need access to but many they will not; however, if developers have been made a member of QPGMR, they have the authority to run all commands authorized to QPGMR.
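Beyond what the group owns and is authorized to, it’s worth listing who is actually in the group; DSPUSRPRF can show that as well:

```
/* List the members of the QPGMR group profile */
DSPUSRPRF USRPRF(QPGMR) TYPE(*GRPMBR)
```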
Deny by Default (Object-Level Authorities and Other Points of Access)
Another point to consider is access via network interfaces such as FTP, DDM, etc. First and foremost, if a TCP/IP service isn’t used (or shouldn’t be used), the server shouldn’t be started. Next, consider using an exit point solution to add another layer of defense (that is, manage or shut off access to the TCP/IP services in use). Many organizations that have exit point software use it only for logging access. But I would encourage you to take the additional step of putting rules in place that shut off access unless specifically allowed. In addition, some vendors have an MFA option when using these interfaces. Finally, don’t forget about SSH. If the SSHD (SSH daemon) is started, everyone can use it. Here’s a previous article on controlling access to SSH.
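For services that shouldn’t run at all, stop them and keep them from restarting. A sketch using FTP and Telnet as examples (adjust to whichever services your organization has retired):

```
/* Stop the FTP server and prevent it from starting with TCP/IP */
ENDTCPSVR SERVER(*FTP)
CHGFTPA AUTOSTART(*NO)

/* Likewise for Telnet */
ENDTCPSVR SERVER(*TELNET)
CHGTELNA AUTOSTART(*NO)
```

Setting AUTOSTART(*NO) matters as much as ending the server; otherwise the service quietly comes back at the next TCP/IP start.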
Access Client Solutions (ACS) also needs to be configured with a “deny by default” attitude. If your end users only need the 5250 emulator feature of ACS, why are you sending out a full-featured deployment? This is one of my favorite exploitation points when I perform IBM i penetration testing. If object-level security hasn’t been set to “deny by default,” and I have access to all features of ACS, the data I can access and the tasks I can perform with a non-administrator profile shocks most organizations. For additional guidance on deploying ACS, see any article or presentation by Wayne Bowers from IBM or the YouTube series by fellow IBM Champion Steve Bradshaw (search for the RowtonIT channel).
Navigator for i initially ships so that no one can access it unless they have *ALLOBJ. Depending on which profiles have been assigned *ALLOBJ, you may want to remove that access and grant access only on an individual or group profile basis. If non-administrator profiles need access, you can customize the access and allow those users, but whatever you do, leave the default access as Deny.
If you’ve encrypted data at rest, make sure that the default decryption action is to bring back a fully masked value—that is, nothing about the value is displayed. Typically, only a handful of users need access to fully decrypted data. Most encryption vendors also allow the ability to ignore *ALLOBJ when decrypting so that *ALLOBJ won’t automatically provide the fully decrypted value. I encourage you to take advantage of that feature. Here’s an article discussing the benefits of encryption.
File shares are another entry point into the system. And not only that, they also put the shared path at risk of being infected with malware, including ransomware. Until IBM i 7.5, there was no way to secure the actual file share, so you must ensure that the object-level setting on the shared object is set to deny by default to limit access to only those users needing it. Another way to protect the system is to turn off broadcasting for the NetServer. This stops the system from broadcasting the list of shares if a user goes to search the network for what’s available.
Finally, get rid of old/unused stuff. Uninstall products not in use. Uninstall past versions of products that you’ve upgraded. (I understand the practice of keeping the old version around to easily be able to revert if the upgrade fails, but then you have to discipline yourself to delete the old version when the upgrade is successful!) Archive data that may need to be retained but isn’t in active use. Purge files of old database records that are no longer needed. Purge IFS directories of old stream files (which often hold confidential or Personally Identifiable Information (PII) data).
Rinse and Repeat (Review Settings Regularly)
All this work on your infrastructure will be for naught if these settings aren’t reviewed regularly. Clients that didn’t implement automated processes to ensure their architecture stayed in place have had settings revert to values they didn’t want. Processes that verify object authorities, object ownership, and authorization list authorities and assignments for objects in libraries and directories are critical to ensuring your architecture stays in place. In addition, you’ll want to monitor for changes to system values, the auto-start values of TCP/IP servers, additional file shares, etc.
Ensuring your configuration stays in place isn’t sufficient. To maintain your zero-trust posture, you’ll need to periodically review who has been assigned private authorities to critical files, directories, and authorization lists as well as whether file shares are still required and whether TCP/IP services are auto-started, etc.
The other configuration that’s critical to review regularly is that of group profile assignments. I’ve seen a client almost fail an audit because the company failed to regularly review group membership. Not only is this an audit risk, but it’s a security risk as well. Sometimes users change jobs, and you get the request to add them to a new group to facilitate their new role but are requested to keep their old group profile assignment to facilitate backfill training. How often do you get the request to eventually remove the old group profile when training is complete? Never. So if you don’t have a process in place to regularly review group membership, users will remain with authorities that are now outside of their current role. I’ve seen this happen, and after several years of neglect, it’s very difficult to untangle. How often should this review occur? I recommend at least every quarter, if not more often.
It should go without saying that another user profile configuration needing regular review is that of special authorities (or “capabilities” as they’re referred to in Navigator for i) assignments. Some of my clients set an alert out of the audit journal so they are immediately made aware of a new *ALLOBJ assignment rather than wait for a regular review cycle.
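If you don’t have a monitoring product, one low-tech starting point is the Copy Audit Journal Entries (CPYAUDJRNE) command, which copies profile create/change (CP) entries from the audit journal to an output file you can then query; spotting new *ALLOBJ assignments in that file is left to your own report or script:

```
/* Copy profile create/change (CP) audit entries to an output
   file for review (see the command help for the output file name) */
CPYAUDJRNE ENTTYP(CP)
```

This assumes auditing is active (QAUDCTL/QAUDLVL configured so that security-relevant entries are being written to QAUDJRN).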
See my most recent book, Mastering IBM i Security, for ideas to help you automate and maintain your security configuration.
The degree to which a zero-trust architecture is embraced and implemented throughout an organization is going to depend on the organization and type of data held. For example, a military contractor or a financial organization holds highly sensitive data and is going to implement a zero-trust architecture more strictly and thoroughly throughout their organization compared to a company that manufactures widgets. But even though it may not hold highly sensitive information, I would encourage the widget manufacturer to embrace a zero-trust architecture as well, because it’s the best protection against current threats. After all, it’s no less of a tragedy—money will be lost and people will be affected—if manufacturing lines are down or the business must close because the company has been attacked and the systems and data aren’t available, or because the intellectual property of the business has been stolen.
Get together with your CIO, CISO, and other management along with technology reps from around your organization to determine the degree to which zero-trust will be implemented. Then determine priorities; perhaps it’s a coordinated effort to implement MFA across your organization. Or an effort to clean up profile capabilities. Understand that zero-trust isn’t going to be implemented overnight. It will take time. But I would prefer to have a plan for implementation over time rather than every individual technology group (firewall, email, network connections, Windows admins, UNIX/Linux admins, along with IBM i admins) having their own idea of a security methodology or technology that needs to be implemented and risk implementing conflicting technologies or priorities.