Kubernetes kubelet certificate renewal timing

MAGUJOB 2024. 10. 29. 21:03

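Kubernetes kubelet certificates

Kubernetes kubelet certificates are renewed automatically roughly once a year.

This post covers where the automatic renewal option lives, when renewal happens, and scenarios in which renewal cannot happen.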

 

(This post is based on material from the official site; where that was unclear, I filled in from my own experience. Please note that it may not be entirely accurate.)

Official site

 

Configure Certificate Rotation for the Kubelet

This page shows how to enable and configure certificate rotation for the kubelet. FEATURE STATE: Kubernetes v1.19 [stable] Before you begin Kubernetes version 1.8.0 or later is required Overview The kubelet uses certificates for authenticating to the Kuber

kubernetes.io



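kubelet certificates

Since around Kubernetes 1.18 the kubelet certificate has been renewed automatically (the official page above marks the feature as stable as of v1.19).

However, if you upgraded from an earlier version and did not adjust the option afterwards, automatic renewal does not happen. In that case you either have to approve the CSR manually once a year or adjust the kubelet option.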

 

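The kubelet uses certificates to authenticate to the Kubernetes API. By default these certificates are issued with a one-year expiry so that they do not have to be renewed too often.

Kubernetes provides kubelet certificate rotation: as the current certificate approaches expiry, the kubelet automatically generates a new key and a certificate signing request and obtains a new certificate from the Kubernetes API. Once the new certificate is available, it is used to authenticate connections to the Kubernetes API.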


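Configuring certificate rotation

When the kubelet starts and is configured to bootstrap, it connects to the Kubernetes API with its initial certificate and submits a certificate signing request (CSR). The status of the CSR can be checked with kubectl get csr.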

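1. When the kubelet on a node creates a CSR, its status is initially shown as Pending.
2. If the request meets certain criteria, the controller manager approves it automatically and the status changes to Approved.
3. The controller manager then signs a certificate that is valid for the period given by the --cluster-signing-duration parameter and attaches it to the CSR.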
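The request side of the steps above can be sketched with the Go standard library alone. This is a minimal illustration, not the kubelet's code: makeNodeCSR and parseCSR are hypothetical helper names, and the node name worker-1 is made up. The subject layout (CN system:node:<nodeName>, O system:nodes) is the one kubelet client certificates use, and the P-256 key matches generateCSR in the source shown later in this post. Submitting the CSR to the API server is left out.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
)

// makeNodeCSR (hypothetical helper) generates a fresh ECDSA P-256 key and a
// PEM-encoded CSR whose subject follows the kubelet client convention.
func makeNodeCSR(nodeName string) (csrPEM []byte, key *ecdsa.PrivateKey, err error) {
	key, err = ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, fmt.Errorf("generate key: %w", err)
	}
	template := &x509.CertificateRequest{
		Subject: pkix.Name{
			CommonName:   "system:node:" + nodeName,
			Organization: []string{"system:nodes"},
		},
	}
	der, err := x509.CreateCertificateRequest(rand.Reader, template, key)
	if err != nil {
		return nil, nil, fmt.Errorf("create csr: %w", err)
	}
	csrPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: der})
	return csrPEM, key, nil
}

// parseCSR (hypothetical helper) decodes a PEM CSR and verifies that it is
// signed by the key it carries.
func parseCSR(csrPEM []byte) (*x509.CertificateRequest, error) {
	block, _ := pem.Decode(csrPEM)
	if block == nil {
		return nil, fmt.Errorf("no PEM block found")
	}
	req, err := x509.ParseCertificateRequest(block.Bytes)
	if err != nil {
		return nil, err
	}
	return req, req.CheckSignature()
}

func main() {
	csrPEM, _, err := makeNodeCSR("worker-1") // made-up node name
	if err != nil {
		panic(err)
	}
	req, err := parseCSR(csrPEM)
	if err != nil {
		panic(err)
	}
	fmt.Println("CSR subject:", req.Subject.CommonName, req.Subject.Organization)
}
```

A new key is generated for every request, which is why each rotation replaces both the key and the certificate.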

 

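The kubelet retrieves the signed certificate from the Kubernetes API, writes it to the location given by the --cert-dir flag, and uses it for its connection to the Kubernetes API. The kubelet automatic renewal option is set in the config.yaml shown below.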

/var/lib/kubelet/config.yaml

apiVersion: kubelet.config.k8s.io/v1beta1
... (omitted)
rotateCertificates: true
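
To confirm that a node upgraded from an older version actually has the option set, a quick check can look like the sketch below. It is deliberately crude: rotationEnabled is a hypothetical helper that scans the text line by line for a top-level rotateCertificates key and is not a YAML parser. The sample string stands in for the contents of the config file.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// rotationEnabled (hypothetical helper) reports whether the configuration
// text contains a top-level "rotateCertificates: true" entry.
// It is a plain line scan for illustration, not a YAML parser.
func rotationEnabled(config string) bool {
	scanner := bufio.NewScanner(strings.NewReader(config))
	for scanner.Scan() {
		line := scanner.Text()
		// Only consider unindented (top-level) keys and skip comment lines.
		if strings.HasPrefix(line, " ") || strings.HasPrefix(line, "\t") || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, ":")
		if !found || strings.TrimSpace(key) != "rotateCertificates" {
			continue
		}
		// Drop a trailing comment before comparing the value.
		value, _, _ = strings.Cut(value, "#")
		return strings.TrimSpace(value) == "true"
	}
	return false
}

func main() {
	// Sample text standing in for the kubelet configuration file.
	sample := "apiVersion: kubelet.config.k8s.io/v1beta1\n" +
		"kind: KubeletConfiguration\n" +
		"rotateCertificates: true\n"
	fmt.Println("rotation enabled:", rotationEnabled(sample))
}
```

On a real node the input would come from reading /var/lib/kubelet/config.yaml (for example with os.ReadFile); a missing key is reported as false, so the file should then be reviewed by hand.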

 


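Automatic renewal as expiry approaches

As the signed certificate approaches expiry, the kubelet automatically creates a new CSR and sends it to the Kubernetes API. This can happen at any point while between 30% and 10% of the certificate's validity period remains. Once the controller manager approves the CSR and attaches the signed certificate, the kubelet retrieves it, writes it to disk, and updates its connection to the Kubernetes API.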


Checking the source code

 

kubernetes/staging/src/k8s.io/client-go/util/certificate/certificate_manager.go at master · kubernetes/kubernetes

Production-Grade Container Scheduling and Management - kubernetes/kubernetes

github.com

 

Full source code

/*
Copyright 2017 The Kubernetes Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package certificate

import (
	"context"
	"crypto/ecdsa"
	"crypto/elliptic"
	cryptorand "crypto/rand"
	"crypto/rsa"
	"crypto/tls"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
	"reflect"
	"sync"
	"time"

	"k8s.io/klog/v2"

	certificates "k8s.io/api/certificates/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	"k8s.io/apimachinery/pkg/util/sets"
	"k8s.io/apimachinery/pkg/util/wait"
	clientset "k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/cert"
	"k8s.io/client-go/util/certificate/csr"
	"k8s.io/client-go/util/keyutil"
)

var (
	// certificateWaitTimeout controls the amount of time we wait for certificate
	// approval in one iteration.
	certificateWaitTimeout = 15 * time.Minute

	kubeletServingUsagesWithEncipherment = []certificates.KeyUsage{
		// https://tools.ietf.org/html/rfc5280#section-4.2.1.3
		//
		// Digital signature allows the certificate to be used to verify
		// digital signatures used during TLS negotiation.
		certificates.UsageDigitalSignature,
		// KeyEncipherment allows the cert/key pair to be used to encrypt
		// keys, including the symmetric keys negotiated during TLS setup
		// and used for data transfer.
		certificates.UsageKeyEncipherment,
		// ServerAuth allows the cert to be used by a TLS server to
		// authenticate itself to a TLS client.
		certificates.UsageServerAuth,
	}
	kubeletServingUsagesNoEncipherment = []certificates.KeyUsage{
		// https://tools.ietf.org/html/rfc5280#section-4.2.1.3
		//
		// Digital signature allows the certificate to be used to verify
		// digital signatures used during TLS negotiation.
		certificates.UsageDigitalSignature,
		// ServerAuth allows the cert to be used by a TLS server to
		// authenticate itself to a TLS client.
		certificates.UsageServerAuth,
	}
	DefaultKubeletServingGetUsages = func(privateKey interface{}) []certificates.KeyUsage {
		switch privateKey.(type) {
		case *rsa.PrivateKey:
			return kubeletServingUsagesWithEncipherment
		default:
			return kubeletServingUsagesNoEncipherment
		}
	}
	kubeletClientUsagesWithEncipherment = []certificates.KeyUsage{
		// https://tools.ietf.org/html/rfc5280#section-4.2.1.3
		//
		// Digital signature allows the certificate to be used to verify
		// digital signatures used during TLS negotiation.
		certificates.UsageDigitalSignature,
		// KeyEncipherment allows the cert/key pair to be used to encrypt
		// keys, including the symmetric keys negotiated during TLS setup
		// and used for data transfer.
		certificates.UsageKeyEncipherment,
		// ClientAuth allows the cert to be used by a TLS client to
		// authenticate itself to the TLS server.
		certificates.UsageClientAuth,
	}
	kubeletClientUsagesNoEncipherment = []certificates.KeyUsage{
		// https://tools.ietf.org/html/rfc5280#section-4.2.1.3
		//
		// Digital signature allows the certificate to be used to verify
		// digital signatures used during TLS negotiation.
		certificates.UsageDigitalSignature,
		// ClientAuth allows the cert to be used by a TLS client to
		// authenticate itself to the TLS server.
		certificates.UsageClientAuth,
	}
	DefaultKubeletClientGetUsages = func(privateKey interface{}) []certificates.KeyUsage {
		switch privateKey.(type) {
		case *rsa.PrivateKey:
			return kubeletClientUsagesWithEncipherment
		default:
			return kubeletClientUsagesNoEncipherment
		}
	}
)

// Manager maintains and updates the certificates in use by this certificate
// manager. In the background it communicates with the API server to get new
// certificates for certificates about to expire.
type Manager interface {
	// Start the API server status sync loop.
	Start()
	// Stop the cert manager loop.
	Stop()
	// Current returns the currently selected certificate from the
	// certificate manager, as well as the associated certificate and key data
	// in PEM format.
	Current() *tls.Certificate
	// ServerHealthy returns true if the manager is able to communicate with
	// the server. This allows a caller to determine whether the cert manager
	// thinks it can potentially talk to the API server. The cert manager may
	// be very conservative and only return true if recent communication has
	// occurred with the server.
	ServerHealthy() bool
}

// Config is the set of configuration parameters available for a new Manager.
type Config struct {
	// ClientsetFn will be used to create a clientset for
	// creating/fetching new certificate requests generated when a key rotation occurs.
	// The function will never be invoked in parallel.
	// It is passed the current client certificate if one exists.
	ClientsetFn ClientsetFunc
	// Template is the CertificateRequest that will be used as a template for
	// generating certificate signing requests for all new keys generated as
	// part of rotation. It follows the same rules as the template parameter of
	// crypto.x509.CreateCertificateRequest in the Go standard libraries.
	Template *x509.CertificateRequest
	// GetTemplate returns the CertificateRequest that will be used as a template for
	// generating certificate signing requests for all new keys generated as
	// part of rotation. It follows the same rules as the template parameter of
	// crypto.x509.CreateCertificateRequest in the Go standard libraries.
	// If no template is available, nil may be returned, and no certificate will be requested.
	// If specified, takes precedence over Template.
	GetTemplate func() *x509.CertificateRequest
	// SignerName is the name of the certificate signer that should sign certificates
	// generated by the manager.
	SignerName string
	// RequestedCertificateLifetime is the requested lifetime length for certificates generated by the manager.
	// Optional.
	// This will set the spec.expirationSeconds field on the CSR.  Controlling the lifetime of
	// the issued certificate is not guaranteed as the signer may choose to ignore the request.
	RequestedCertificateLifetime *time.Duration
	// Usages is the types of usages that certificates generated by the manager
	// can be used for. It is mutually exclusive with GetUsages.
	Usages []certificates.KeyUsage
	// GetUsages is dynamic way to get the types of usages that certificates generated by the manager
	// can be used for. If Usages is not nil, GetUsages has to be nil, vice versa.
	// It is mutually exclusive with Usages.
	GetUsages func(privateKey interface{}) []certificates.KeyUsage
	// CertificateStore is a persistent store where the current cert/key is
	// kept and future cert/key pairs will be persisted after they are
	// generated.
	CertificateStore Store
	// BootstrapCertificatePEM is the certificate data that will be returned
	// from the Manager if the CertificateStore doesn't have any cert/key pairs
	// currently available and has not yet had a chance to get a new cert/key
	// pair from the API. If the CertificateStore does have a cert/key pair,
	// this will be ignored. If there is no cert/key pair available in the
	// CertificateStore, as soon as Start is called, it will request a new
	// cert/key pair from the CertificateSigningRequestClient. This is intended
	// to allow the first boot of a component to be initialized using a
	// generic, multi-use cert/key pair which will be quickly replaced with a
	// unique cert/key pair.
	BootstrapCertificatePEM []byte
	// BootstrapKeyPEM is the key data that will be returned from the Manager
	// if the CertificateStore doesn't have any cert/key pairs currently
	// available. If the CertificateStore does have a cert/key pair, this will
	// be ignored. If the bootstrap cert/key pair are used, they will be
	// rotated at the first opportunity, possibly well in advance of expiring.
	// This is intended to allow the first boot of a component to be
	// initialized using a generic, multi-use cert/key pair which will be
	// quickly replaced with a unique cert/key pair.
	BootstrapKeyPEM []byte `datapolicy:"security-key"`
	// CertificateRotation will record a metric showing the time in seconds
	// that certificates lived before being rotated. This metric is a histogram
	// because there is value in keeping a history of rotation cadences. It
	// allows one to setup monitoring and alerting of unexpected rotation
	// behavior and track trends in rotation frequency.
	CertificateRotation Histogram
	// CertifcateRenewFailure will record a metric that keeps track of
	// certificate renewal failures.
	CertificateRenewFailure Counter
	// Name is an optional string that will be used when writing log output
	// or returning errors from manager methods. If not set, SignerName will
	// be used, if SignerName is not set, if Usages includes client auth the
	// name will be "client auth", otherwise the value will be "server".
	Name string
	// Logf is an optional function that log output will be sent to from the
	// certificate manager. If not set it will use klog.V(2)
	Logf func(format string, args ...interface{})
}

// Store is responsible for getting and updating the current certificate.
// Depending on the concrete implementation, the backing store for this
// behavior may vary.
type Store interface {
	// Current returns the currently selected certificate, as well as the
	// associated certificate and key data in PEM format. If the Store doesn't
	// have a cert/key pair currently, it should return a NoCertKeyError so
	// that the Manager can recover by using bootstrap certificates to request
	// a new cert/key pair.
	Current() (*tls.Certificate, error)
	// Update accepts the PEM data for the cert/key pair and makes the new
	// cert/key pair the 'current' pair, that will be returned by future calls
	// to Current().
	Update(cert, key []byte) (*tls.Certificate, error)
}

// Gauge will record the remaining lifetime of the certificate each time it is
// updated.
type Gauge interface {
	Set(float64)
}

// Histogram will record the time a rotated certificate was used before being
// rotated.
type Histogram interface {
	Observe(float64)
}

// Counter will wrap a counter with labels
type Counter interface {
	Inc()
}

// NoCertKeyError indicates there is no cert/key currently available.
type NoCertKeyError string

// ClientsetFunc returns a new clientset for discovering CSR API availability and requesting CSRs.
// It is passed the current certificate if one is available and valid.
type ClientsetFunc func(current *tls.Certificate) (clientset.Interface, error)

func (e *NoCertKeyError) Error() string { return string(*e) }

type manager struct {
	getTemplate func() *x509.CertificateRequest

	// lastRequestLock guards lastRequestCancel and lastRequest
	lastRequestLock   sync.Mutex
	lastRequestCancel context.CancelFunc
	lastRequest       *x509.CertificateRequest

	dynamicTemplate              bool
	signerName                   string
	requestedCertificateLifetime *time.Duration
	getUsages                    func(privateKey interface{}) []certificates.KeyUsage
	forceRotation                bool

	certStore Store

	certificateRotation     Histogram
	certificateRenewFailure Counter

	// the following variables must only be accessed under certAccessLock
	certAccessLock sync.RWMutex
	cert           *tls.Certificate
	serverHealth   bool

	// the clientFn must only be accessed under the clientAccessLock
	clientAccessLock sync.Mutex
	clientsetFn      ClientsetFunc
	stopCh           chan struct{}
	stopped          bool

	// Set to time.Now but can be stubbed out for testing
	now func() time.Time

	name string
	logf func(format string, args ...interface{})
}

// NewManager returns a new certificate manager. A certificate manager is
// responsible for being the authoritative source of certificates in the
// Kubelet and handling updates due to rotation.
func NewManager(config *Config) (Manager, error) {
	cert, forceRotation, err := getCurrentCertificateOrBootstrap(
		config.CertificateStore,
		config.BootstrapCertificatePEM,
		config.BootstrapKeyPEM)
	if err != nil {
		return nil, err
	}

	getTemplate := config.GetTemplate
	if getTemplate == nil {
		getTemplate = func() *x509.CertificateRequest { return config.Template }
	}

	if config.GetUsages != nil && config.Usages != nil {
		return nil, errors.New("cannot specify both GetUsages and Usages")
	}
	if config.GetUsages == nil && config.Usages == nil {
		return nil, errors.New("either GetUsages or Usages should be specified")
	}
	var getUsages func(interface{}) []certificates.KeyUsage
	if config.GetUsages != nil {
		getUsages = config.GetUsages
	} else {
		getUsages = func(interface{}) []certificates.KeyUsage { return config.Usages }
	}
	m := manager{
		stopCh:                       make(chan struct{}),
		clientsetFn:                  config.ClientsetFn,
		getTemplate:                  getTemplate,
		dynamicTemplate:              config.GetTemplate != nil,
		signerName:                   config.SignerName,
		requestedCertificateLifetime: config.RequestedCertificateLifetime,
		getUsages:                    getUsages,
		certStore:                    config.CertificateStore,
		cert:                         cert,
		forceRotation:                forceRotation,
		certificateRotation:          config.CertificateRotation,
		certificateRenewFailure:      config.CertificateRenewFailure,
		now:                          time.Now,
	}

	name := config.Name
	if len(name) == 0 {
		name = m.signerName
	}
	if len(name) == 0 {
		usages := getUsages(nil)
		switch {
		case hasKeyUsage(usages, certificates.UsageClientAuth):
			name = string(certificates.UsageClientAuth)
		default:
			name = "certificate"
		}
	}

	m.name = name
	m.logf = config.Logf
	if m.logf == nil {
		m.logf = func(format string, args ...interface{}) { klog.V(2).Infof(format, args...) }
	}

	return &m, nil
}

// Current returns the currently selected certificate from the certificate
// manager. This can be nil if the manager was initialized without a
// certificate and has not yet received one from the
// CertificateSigningRequestClient, or if the current cert has expired.
func (m *manager) Current() *tls.Certificate {
	m.certAccessLock.RLock()
	defer m.certAccessLock.RUnlock()
	if m.cert != nil && m.cert.Leaf != nil && m.now().After(m.cert.Leaf.NotAfter) {
		m.logf("%s: Current certificate is expired", m.name)
		return nil
	}
	return m.cert
}

// ServerHealthy returns true if the cert manager believes the server
// is currently alive.
func (m *manager) ServerHealthy() bool {
	m.certAccessLock.RLock()
	defer m.certAccessLock.RUnlock()
	return m.serverHealth
}

// Stop terminates the manager.
func (m *manager) Stop() {
	m.clientAccessLock.Lock()
	defer m.clientAccessLock.Unlock()
	if m.stopped {
		return
	}
	close(m.stopCh)
	m.stopped = true
}

// Start will start the background work of rotating the certificates.
func (m *manager) Start() {
	// Certificate rotation depends on access to the API server certificate
	// signing API, so don't start the certificate manager if we don't have a
	// client.
	if m.clientsetFn == nil {
		m.logf("%s: Certificate rotation is not enabled, no connection to the apiserver", m.name)
		return
	}
	m.logf("%s: Certificate rotation is enabled", m.name)

	templateChanged := make(chan struct{})
	go wait.Until(func() {
		deadline := m.nextRotationDeadline()
		if sleepInterval := deadline.Sub(m.now()); sleepInterval > 0 {
			m.logf("%s: Waiting %v for next certificate rotation", m.name, sleepInterval)

			timer := time.NewTimer(sleepInterval)
			defer timer.Stop()

			select {
			case <-timer.C:
				// unblock when deadline expires
			case <-templateChanged:
				_, lastRequestTemplate := m.getLastRequest()
				if reflect.DeepEqual(lastRequestTemplate, m.getTemplate()) {
					// if the template now matches what we last requested, restart the rotation deadline loop
					return
				}
				m.logf("%s: Certificate template changed, rotating", m.name)
			}
		}

		// Don't enter rotateCerts and trigger backoff if we don't even have a template to request yet
		if m.getTemplate() == nil {
			return
		}

		backoff := wait.Backoff{
			Duration: 2 * time.Second,
			Factor:   2,
			Jitter:   0.1,
			Steps:    5,
		}
		if err := wait.ExponentialBackoff(backoff, m.rotateCerts); err != nil {
			utilruntime.HandleError(fmt.Errorf("%s: Reached backoff limit, still unable to rotate certs: %v", m.name, err))
			wait.PollInfinite(32*time.Second, m.rotateCerts)
		}
	}, time.Second, m.stopCh)

	if m.dynamicTemplate {
		go wait.Until(func() {
			// check if the current template matches what we last requested
			lastRequestCancel, lastRequestTemplate := m.getLastRequest()

			if !m.certSatisfiesTemplate() && !reflect.DeepEqual(lastRequestTemplate, m.getTemplate()) {
				// if the template is different, queue up an interrupt of the rotation deadline loop.
				// if we've requested a CSR that matches the new template by the time the interrupt is handled, the interrupt is disregarded.
				if lastRequestCancel != nil {
					// if we're currently waiting on a submitted request that no longer matches what we want, stop waiting
					lastRequestCancel()
				}
				select {
				case templateChanged <- struct{}{}:
				case <-m.stopCh:
				}
			}
		}, time.Second, m.stopCh)
	}
}

func getCurrentCertificateOrBootstrap(
	store Store,
	bootstrapCertificatePEM []byte,
	bootstrapKeyPEM []byte) (cert *tls.Certificate, shouldRotate bool, errResult error) {

	currentCert, err := store.Current()
	if err == nil {
		// if the current cert is expired, fall back to the bootstrap cert
		if currentCert.Leaf != nil && time.Now().Before(currentCert.Leaf.NotAfter) {
			return currentCert, false, nil
		}
	} else {
		if _, ok := err.(*NoCertKeyError); !ok {
			return nil, false, err
		}
	}

	if bootstrapCertificatePEM == nil || bootstrapKeyPEM == nil {
		return nil, true, nil
	}

	bootstrapCert, err := tls.X509KeyPair(bootstrapCertificatePEM, bootstrapKeyPEM)
	if err != nil {
		return nil, false, err
	}
	if len(bootstrapCert.Certificate) < 1 {
		return nil, false, fmt.Errorf("no cert/key data found")
	}

	certs, err := x509.ParseCertificates(bootstrapCert.Certificate[0])
	if err != nil {
		return nil, false, fmt.Errorf("unable to parse certificate data: %v", err)
	}
	if len(certs) < 1 {
		return nil, false, fmt.Errorf("no cert data found")
	}
	bootstrapCert.Leaf = certs[0]

	if _, err := store.Update(bootstrapCertificatePEM, bootstrapKeyPEM); err != nil {
		utilruntime.HandleError(fmt.Errorf("unable to set the cert/key pair to the bootstrap certificate: %v", err))
	}

	return &bootstrapCert, true, nil
}

func (m *manager) getClientset() (clientset.Interface, error) {
	current := m.Current()
	m.clientAccessLock.Lock()
	defer m.clientAccessLock.Unlock()
	return m.clientsetFn(current)
}

// RotateCerts is exposed for testing only and is not a part of the public interface.
// Returns true if it changed the cert, false otherwise. Error is only returned in
// exceptional cases.
func (m *manager) RotateCerts() (bool, error) {
	return m.rotateCerts()
}

// rotateCerts attempts to request a client cert from the server, wait a reasonable
// period of time for it to be signed, and then update the cert on disk. If it cannot
// retrieve a cert, it will return false. It will only return error in exceptional cases.
// This method also keeps track of "server health" by interpreting the responses it gets
// from the server on the various calls it makes.
// TODO: return errors, have callers handle and log them correctly
func (m *manager) rotateCerts() (bool, error) {
	m.logf("%s: Rotating certificates", m.name)

	template, csrPEM, keyPEM, privateKey, err := m.generateCSR()
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("%s: Unable to generate a certificate signing request: %v", m.name, err))
		if m.certificateRenewFailure != nil {
			m.certificateRenewFailure.Inc()
		}
		return false, nil
	}

	// request the client each time
	clientSet, err := m.getClientset()
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("%s: Unable to load a client to request certificates: %v", m.name, err))
		if m.certificateRenewFailure != nil {
			m.certificateRenewFailure.Inc()
		}
		return false, nil
	}

	getUsages := m.getUsages
	if m.getUsages == nil {
		getUsages = DefaultKubeletClientGetUsages
	}
	usages := getUsages(privateKey)
	// Call the Certificate Signing Request API to get a certificate for the
	// new private key
	reqName, reqUID, err := csr.RequestCertificate(clientSet, csrPEM, "", m.signerName, m.requestedCertificateLifetime, usages, privateKey)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("%s: Failed while requesting a signed certificate from the control plane: %v", m.name, err))
		if m.certificateRenewFailure != nil {
			m.certificateRenewFailure.Inc()
		}
		return false, m.updateServerError(err)
	}

	ctx, cancel := context.WithTimeout(context.Background(), certificateWaitTimeout)
	defer cancel()

	// Once we've successfully submitted a CSR for this template, record that we did so
	m.setLastRequest(cancel, template)

	// Wait for the certificate to be signed. This interface and internal timout
	// is a remainder after the old design using raw watch wrapped with backoff.
	crtPEM, err := csr.WaitForCertificate(ctx, clientSet, reqName, reqUID)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("%s: certificate request was not signed: %v", m.name, err))
		if m.certificateRenewFailure != nil {
			m.certificateRenewFailure.Inc()
		}
		return false, nil
	}

	cert, err := m.certStore.Update(crtPEM, keyPEM)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("%s: Unable to store the new cert/key pair: %v", m.name, err))
		if m.certificateRenewFailure != nil {
			m.certificateRenewFailure.Inc()
		}
		return false, nil
	}

	if old := m.updateCached(cert); old != nil && m.certificateRotation != nil {
		m.certificateRotation.Observe(m.now().Sub(old.Leaf.NotBefore).Seconds())
	}

	return true, nil
}

// Check that the current certificate on disk satisfies the requests from the
// current template.
//
// Note that extra items in the certificate's SAN or orgs that don't exist in
// the template will not trigger a renewal.
//
// Requires certAccessLock to be locked.
func (m *manager) certSatisfiesTemplateLocked() bool {
	if m.cert == nil {
		return false
	}

	if template := m.getTemplate(); template != nil {
		if template.Subject.CommonName != m.cert.Leaf.Subject.CommonName {
			m.logf("%s: Current certificate CN (%s) does not match requested CN (%s)", m.name, m.cert.Leaf.Subject.CommonName, template.Subject.CommonName)
			return false
		}

		currentDNSNames := sets.NewString(m.cert.Leaf.DNSNames...)
		desiredDNSNames := sets.NewString(template.DNSNames...)
		missingDNSNames := desiredDNSNames.Difference(currentDNSNames)
		if len(missingDNSNames) > 0 {
			m.logf("%s: Current certificate is missing requested DNS names %v", m.name, missingDNSNames.List())
			return false
		}

		currentIPs := sets.NewString()
		for _, ip := range m.cert.Leaf.IPAddresses {
			currentIPs.Insert(ip.String())
		}
		desiredIPs := sets.NewString()
		for _, ip := range template.IPAddresses {
			desiredIPs.Insert(ip.String())
		}
		missingIPs := desiredIPs.Difference(currentIPs)
		if len(missingIPs) > 0 {
			m.logf("%s: Current certificate is missing requested IP addresses %v", m.name, missingIPs.List())
			return false
		}

		currentOrgs := sets.NewString(m.cert.Leaf.Subject.Organization...)
		desiredOrgs := sets.NewString(template.Subject.Organization...)
		missingOrgs := desiredOrgs.Difference(currentOrgs)
		if len(missingOrgs) > 0 {
			m.logf("%s: Current certificate is missing requested orgs %v", m.name, missingOrgs.List())
			return false
		}
	}

	return true
}

func (m *manager) certSatisfiesTemplate() bool {
	m.certAccessLock.RLock()
	defer m.certAccessLock.RUnlock()
	return m.certSatisfiesTemplateLocked()
}

// nextRotationDeadline returns a value for the threshold at which the
// current certificate should be rotated, 80%+/-10% of the expiration of the
// certificate.
func (m *manager) nextRotationDeadline() time.Time {
	// forceRotation is not protected by locks
	if m.forceRotation {
		m.forceRotation = false
		return m.now()
	}

	m.certAccessLock.RLock()
	defer m.certAccessLock.RUnlock()

	if !m.certSatisfiesTemplateLocked() {
		return m.now()
	}

	notAfter := m.cert.Leaf.NotAfter
	totalDuration := float64(notAfter.Sub(m.cert.Leaf.NotBefore))
	deadline := m.cert.Leaf.NotBefore.Add(jitteryDuration(totalDuration))

	m.logf("%s: Certificate expiration is %v, rotation deadline is %v", m.name, notAfter, deadline)
	return deadline
}

// jitteryDuration uses some jitter to set the rotation threshold so each node
// will rotate at approximately 70-90% of the total lifetime of the
// certificate.  With jitter, if a number of nodes are added to a cluster at
// approximately the same time (such as cluster creation time), they won't all
// try to rotate certificates at the same time for the rest of the life of the
// cluster.
//
// This function is represented as a variable to allow replacement during testing.
var jitteryDuration = func(totalDuration float64) time.Duration {
	return wait.Jitter(time.Duration(totalDuration), 0.2) - time.Duration(totalDuration*0.3)
}

// updateCached sets the most recent retrieved cert and returns the old cert.
// It also sets the server as assumed healthy.
func (m *manager) updateCached(cert *tls.Certificate) *tls.Certificate {
	m.certAccessLock.Lock()
	defer m.certAccessLock.Unlock()
	m.serverHealth = true
	old := m.cert
	m.cert = cert
	return old
}

// updateServerError takes an error returned by the server and infers
// the health of the server based on the error. It will return nil if
// the error does not require immediate termination of any wait loops,
// and otherwise it will return the error.
func (m *manager) updateServerError(err error) error {
	m.certAccessLock.Lock()
	defer m.certAccessLock.Unlock()
	switch {
	case apierrors.IsUnauthorized(err):
		// SSL terminating proxies may report this error instead of the master
		m.serverHealth = true
	case apierrors.IsUnexpectedServerError(err):
		// generally indicates a proxy or other load balancer problem, rather than a problem coming
		// from the master
		m.serverHealth = false
	default:
		// Identify known errors that could be expected for a cert request that
		// indicate everything is working normally
		m.serverHealth = apierrors.IsNotFound(err) || apierrors.IsForbidden(err)
	}
	return nil
}

func (m *manager) generateCSR() (template *x509.CertificateRequest, csrPEM []byte, keyPEM []byte, key interface{}, err error) {
	// Generate a new private key.
	privateKey, err := ecdsa.GenerateKey(elliptic.P256(), cryptorand.Reader)
	if err != nil {
		return nil, nil, nil, nil, fmt.Errorf("%s: unable to generate a new private key: %v", m.name, err)
	}
	der, err := x509.MarshalECPrivateKey(privateKey)
	if err != nil {
		return nil, nil, nil, nil, fmt.Errorf("%s: unable to marshal the new key to DER: %v", m.name, err)
	}

	keyPEM = pem.EncodeToMemory(&pem.Block{Type: keyutil.ECPrivateKeyBlockType, Bytes: der})

	template = m.getTemplate()
	if template == nil {
		return nil, nil, nil, nil, fmt.Errorf("%s: unable to create a csr, no template available", m.name)
	}
	csrPEM, err = cert.MakeCSRFromTemplate(privateKey, template)
	if err != nil {
		return nil, nil, nil, nil, fmt.Errorf("%s: unable to create a csr from the private key: %v", m.name, err)
	}
	return template, csrPEM, keyPEM, privateKey, nil
}

func (m *manager) getLastRequest() (context.CancelFunc, *x509.CertificateRequest) {
	m.lastRequestLock.Lock()
	defer m.lastRequestLock.Unlock()
	return m.lastRequestCancel, m.lastRequest
}

func (m *manager) setLastRequest(cancel context.CancelFunc, r *x509.CertificateRequest) {
	m.lastRequestLock.Lock()
	defer m.lastRequestLock.Unlock()
	m.lastRequestCancel = cancel
	m.lastRequest = r
}

func hasKeyUsage(usages []certificates.KeyUsage, usage certificates.KeyUsage) bool {
	for _, u := range usages {
		if u == usage {
			return true
		}
	}
	return false
}

 

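nextRotationDeadline is the function that decides when a certificate is automatically renewed.

It computes the renewal time from the current certificate's expiry and its total validity period, with a random offset added. As a result, nodes do not all renew at once but at slightly different times, which is said to improve system stability.

That said, I am still not sure exactly which aspect of stability the staggering improves; the comment on jitteryDuration only says that nodes added at about the same time will not all try to rotate at the same time.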

 

Walking through nextRotationDeadline

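1. Check whether rotation is forced (lines 665-667)

if m.forceRotation {
    m.forceRotation = false
    return m.now()
}

This checks whether forceRotation is true, which means the certificate must be renewed right away. NewManager sets the flag when the store has no valid certificate and the manager falls back to the bootstrap certificate.

In that case the function resets forceRotation to false and returns the current time, so the rotation happens immediately and only once.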

 

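2. Take the certificate access lock (lines 670-671)

m.certAccessLock.RLock()
defer m.certAccessLock.RUnlock()

This takes a read lock on certAccessLock. Other readers can still hold the lock at the same time, but a writer such as updateCached is blocked until the read finishes, so the certificate cannot be swapped out while it is being inspected.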
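
The read-lock and write-lock split described here can be sketched as follows. certHolder is a hypothetical stand-in for the manager's cached certificate (a plain string replaces *tls.Certificate); only the locking pattern follows the upstream Current and updateCached methods.

```go
package main

import (
	"fmt"
	"sync"
)

// certHolder is a hypothetical stand-in for the manager's cached certificate.
type certHolder struct {
	mu   sync.RWMutex
	cert string
}

// current takes the read lock: any number of readers may hold it at once.
func (h *certHolder) current() string {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.cert
}

// update takes the write lock: it waits for active readers to finish and
// excludes everyone else while the certificate is swapped.
func (h *certHolder) update(cert string) string {
	h.mu.Lock()
	defer h.mu.Unlock()
	old := h.cert
	h.cert = cert
	return old
}

func main() {
	h := &certHolder{cert: "cert-v1"}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = h.current() // concurrent reads do not block each other
		}()
	}
	wg.Wait()

	old := h.update("cert-v2")
	fmt.Println("replaced", old, "with", h.current())
}
```

A read lock is enough in nextRotationDeadline because the function only inspects the certificate's validity dates and never replaces it.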

 

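3. Check whether the certificate satisfies the template (lines 673-674)

if !m.certSatisfiesTemplateLocked() {
    return m.now()
}

m.certSatisfiesTemplateLocked() checks whether the current certificate still matches the request template: the common name, and every DNS name, IP address, and organization the template asks for. If it does not, the function returns the current time, so a certificate that no longer matches the template is renewed right away.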
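
The "is anything the template asks for missing from the certificate" test is a set difference. The upstream code uses sets.NewString and Difference; the sketch below does the same with a plain map so that it runs on the standard library alone. missingNames is a hypothetical helper, and the host names are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// missingNames (hypothetical helper) returns the entries of desired that are
// absent from current, sorted and without duplicates. Extra entries in
// current are ignored, which mirrors the upstream behaviour: extra SANs in
// the certificate do not trigger a renewal.
func missingNames(desired, current []string) []string {
	have := make(map[string]struct{}, len(current))
	for _, name := range current {
		have[name] = struct{}{}
	}
	var missing []string
	for _, name := range desired {
		if _, ok := have[name]; !ok {
			missing = append(missing, name)
			have[name] = struct{}{} // avoid reporting the same name twice
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	certNames := []string{"node-1", "node-1.example.internal"}         // names in the current certificate
	templateNames := []string{"node-1", "node-1.new.example.internal"} // names the template requests

	if missing := missingNames(templateNames, certNames); len(missing) > 0 {
		fmt.Println("rotate now, certificate is missing:", missing)
	}
}
```

The same comparison is applied to DNS names, IP addresses, and organizations in turn; the first non-empty difference is enough to trigger an immediate rotation.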

 

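4. Compute the certificate's total validity period (lines 677-678)

notAfter := m.cert.Leaf.NotAfter
totalDuration := float64(notAfter.Sub(m.cert.Leaf.NotBefore))

This step takes the difference between NotAfter (when the certificate expires) and NotBefore (when it becomes valid) to get the certificate's total validity period.

That value (totalDuration) is then used to decide the renewal point.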

 

5. Pick a randomized renewal point with jitteryDuration (line 679)

deadline := m.cert.Leaf.NotBefore.Add(jitteryDuration(totalDuration))

jitteryDuration(totalDuration) returns a random offset within the validity period, and the deadline is NotBefore plus that offset. The offset falls between 70% and 90% of the total validity period, so each node renews at a slightly different time.

 

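6. Log the result (line 681)

m.logf("%s: Certificate expiration is %v, rotation deadline is %v", m.name, notAfter, deadline)

The certificate's expiry time (notAfter) and the renewal point (deadline) are written to the log to help with debugging and monitoring. When something goes wrong, this line shows when the certificate expires and when renewal was scheduled.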

 

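7. What jitteryDuration does (lines 693-695)

var jitteryDuration = func(totalDuration float64) time.Duration {
    return wait.Jitter(time.Duration(totalDuration), 0.2) - time.Duration(totalDuration*0.3)
}

jitteryDuration produces the random offset (70% to 90% of the lifetime) used as the renewal point. wait.Jitter with a factor of 0.2 returns a duration between 100% and 120% of the total validity period; it only adds to the base value, so the spread is not ±20%. Subtracting 30% of the total then leaves 70% to 90%. Each node therefore attempts renewal somewhere in that range, which keeps many nodes from renewing at the same moment and reduces load on the system.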

 

Together, nextRotationDeadline and jitteryDuration spread certificate renewals across the 70% to 90% mark of each certificate's lifetime, which avoids a burst of simultaneous CSRs and the load that would come with it.


 

If you spot incorrect information or have a question, please leave a comment along with your email address.
