
Questions tagged with Monitoring


Status check 2/2 failed from the Amazon side

Hi team, one of our servers went down yesterday with a 2/2 status check failure caused on the Amazon side, and because of this we are also unable to log in. I have tried multiple troubleshooting steps, such as starting, stopping, and rebooting the instance, enabling detailed monitoring, and collecting the system logs, but it appears that we are unable to recover the instance at this time. I also tried increasing the server's resources for the time being, but this did not solve the problem. Please help us recover from this issue; see the system log below for more details (instance type: m5a.4xlarge, with 1000 GB of gp2 storage).

```
[ 0.000000] Linux version 5.8.0-1038-aws (buildd@lcy01-amd64-016) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #40~20.04.1-Ubuntu SMP Thu Jun 17 13:25:28 UTC 2021 (Ubuntu 5.8.0-1038.40~20.04.1-aws 5.8.18)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.8.0-1038-aws root=PARTUUID=5198cbc0-01 ro console=tty1 console=ttyS0 nvme_core.io_timeout=4294967295 panic=-1
[ 0.000000] KERNEL supported cpus:
[ 0.000000]   Intel GenuineIntel
[ 0.000000]   AMD AuthenticAMD
[ 0.000000]   Hygon HygonGenuine
[ 0.000000]   Centaur CentaurHauls
[ 0.000000]   zhaoxin Shanghai
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffe8fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000bffe9000-0x00000000bfffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000e03fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000ff7ffffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000ff8000000-0x000000103fffffff] reserved
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.7 present.
[ 0.000000] DMI: Amazon EC2 m5a.4xlarge/, BIOS 1.0 10/16/2017
[ 0.000000] Hypervisor detected: KVM
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: cpu 0, msr 124a01001, primary cpu clock
[ 0.000000] kvm-clock: using sched offset of 11809202197 cycles
[ 0.000003] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.000005] tsc: Detected 2199.474 MHz processor
[ 0.000602] last_pfn = 0xff8000 max_arch_pfn = 0x400000000
[ 0.000709] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.000736] last_pfn = 0xbffe9 max_arch_pfn = 0x400000000
[ 0.006651] check: Scanning 1 areas for low memory corruption
[ 0.006703] Using GB pages for direct mapping
[ 0.006927] RAMDISK: [mem 0x37715000-0x37b81fff]
[ 0.006938] ACPI: Early table checksum verification disabled
[ 0.006945] ACPI: RSDP 0x00000000000F8F40 000014 (v00 AMAZON)
[ 0.006952] ACPI: RSDT 0x00000000BFFEDCB0 000044 (v01 AMAZON AMZNRSDT 00000001 AMZN 00000001)
[ 0.006958] ACPI: FACP 0x00000000BFFEFF80 000074 (v01 AMAZON AMZNFACP 00000001 AMZN 00000001)
[ 0.006964] ACPI: DSDT 0x00000000BFFEDD00 0010E9 (v01 AMAZON AMZNDSDT 00000001 AMZN 00000001)
[ 0.006968] ACPI: FACS 0x00000000BFFEFF40 000040
[ 0.006971] ACPI: SSDT 0x00000000BFFEF170 000DC8 (v01 AMAZON AMZNSSDT 00000001 AMZN 00000001)
[ 0.006975] ACPI: APIC 0x00000000BFFEF010 0000E6 (v01 AMAZON AMZNAPIC 00000001 AMZN 00000001)
[ 0.006978] ACPI: SRAT 0x00000000BFFEEE90 000180 (v01 AMAZON AMZNSRAT 00000001 AMZN 00000001)
[ 0.006981] ACPI: SLIT 0x00000000BFFEEE20 00006C (v01 AMAZON AMZNSLIT 00000001 AMZN 00000001)
[ 0.006985] ACPI: WAET 0x00000000BFFEEDF0 000028 (v01 AMAZON AMZNWAET 00000001 AMZN 00000001)
[ 0.006991] ACPI: HPET 0x00000000000C9000 000038 (v01 AMAZON AMZNHPET 00000001 AMZN 00000001)
[ 0.006994] ACPI: SSDT 0x00000000000C9040 00007B (v01 AMAZON AMZNSSDT 00000001 AMZN 00000001)
[ 0.006997] ACPI: Reserving FACP table memory at [mem 0xbffeff80-0xbffefff3]
[ 0.006999] ACPI: Reserving DSDT table memory at [mem 0xbffedd00-0xbffeede8]
[ 0.007000] ACPI: Reserving FACS table memory at [mem 0xbffeff40-0xbffeff7f]
[ 0.007001] ACPI: Reserving SSDT table memory at [mem 0xbffef170-0xbffeff37]
[ 0.007002] ACPI: Reserving APIC table memory at [mem 0xbffef010-0xbffef0f5]
[ 0.007003] ACPI: Reserving SRAT table memory at [mem 0xbffeee90-0xbffef00f]
[ 0.007004] ACPI: Reserving SLIT table memory at [mem 0xbffeee20-0xbffeee8b]
[ 0.007005] ACPI: Reserving WAET table memory at [mem 0xbffeedf0-0xbffeee17]
[ 0.007007] ACPI: Reserving HPET table memory at [mem 0xc9000-0xc9037]
[ 0.007008] ACPI: Reserving SSDT table memory at [mem 0xc9040-0xc90ba]
[ 0.007080] SRAT: PXM 0 -> APIC 0x00 -> Node 0
[ 0.007082] SRAT: PXM 0 -> APIC 0x01 -> Node 0
[ 0.007083] SRAT: PXM 0 -> APIC 0x02 -> Node 0
[ 0.007084] SRAT: PXM 0 -> APIC 0x03 -> Node 0
[ 0.007085] SRAT: PXM 0 -> APIC 0x04 -> Node 0
[ 0.007086] SRAT: PXM 0 -> APIC 0x05 -> Node 0
[ 0.007087] SRAT: PXM 0 -> APIC 0x06 -> Node 0
[ 0.007088] SRAT: PXM 0 -> APIC 0x07 -> Node 0
[ 0.007088] SRAT: PXM 0 -> APIC 0x08 -> Node 0
[ 0.007089] SRAT: PXM 0 -> APIC 0x09 -> Node 0
[ 0.007090] SRAT: PXM 0 -> APIC 0x0a -> Node 0
[ 0.007091] SRAT: PXM 0 -> APIC 0x0b -> Node 0
[ 0.007092] SRAT: PXM 0 -> APIC 0x0c -> Node 0
[ 0.007093] SRAT: PXM 0 -> APIC 0x0d -> Node 0
[ 0.007094] SRAT: PXM 0 -> APIC 0x0e -> Node 0
[ 0.007095] SRAT: PXM 0 -> APIC 0x0f -> Node 0
[ 0.007098] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0xbfffffff]
[ 0.007099] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x103fffffff]
[ 0.007112] NUMA: Node 0 [mem 0x00000000-0xbfffffff] + [mem 0x100000000-0xff7ffffff] -> [mem 0x00000000-0xff7ffffff]
[ 0.007121] NODE_DATA(0) allocated [mem 0xff7fd5000-0xff7ffefff]
[ 0.007503] Zone ranges:
[ 0.007504]   DMA [mem 0x0000000000001000-0x0000000000ffffff]
[ 0.007505]   DMA32 [mem 0x0000000001000000-0x00000000ffffffff]
[ 0.007507]   Normal [mem 0x0000000100000000-0x0000000ff7ffffff]
[ 0.007508]   Device empty
[ 0.007509] Movable zone start for each node
[ 0.007513] Early memory node ranges
[ 0.007514]   node 0: [mem 0x0000000000001000-0x000000000009efff]
[ 0.007515]   node 0: [mem 0x0000000000100000-0x00000000bffe8fff]
[ 0.007516]   node 0: [mem 0x0000000100000000-0x0000000ff7ffffff]
[ 0.007522] Initmem setup node 0 [mem 0x0000000000001000-0x0000000ff7ffffff]
[ 0.007827] DMA zone: 28770 pages in unavailable ranges
[ 0.013325] DMA32 zone: 23 pages in unavailable ranges
[ 0.128485] ACPI: PM-Timer IO Port: 0xb008
[ 0.128498] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[ 0.128538] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[ 0.128541] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[ 0.128543] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.128545] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[ 0.128546] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[ 0.128551] Using ACPI (MADT) for SMP configuration information
[ 0.128553] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[ 0.128562] smpboot: Allowing 16 CPUs, 0 hotplug CPUs
[ 0.128591] PM: hibernation: Registered nosave memory: [mem 0x00000000-0x00000fff]
[ 0.128593] PM: hibernation: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
[ 0.128594] PM: hibernation: Registered nosave memory: [mem 0x000a0000-0x000effff]
[ 0.128595] PM: hibernation: Registered nosave memory: [mem 0x000f0000-0x000fffff]
[ 0.128597] PM: hibernation: Registered nosave memory: [mem 0xbffe9000-0xbfffffff]
[ 0.128598] PM: hibernation: Registered nosave memory: [mem 0xc0000000-0xdfffffff]
[ 0.128598] PM: hibernation: Registered nosave memory: [mem 0xe0000000-0xe03fffff]
[ 0.128599] PM: hibernation: Registered nosave memory: [mem 0xe0400000-0xfffbffff]
[ 0.128600] PM: hibernation: Registered nosave memory: [mem 0xfffc0000-0xffffffff]
[ 0.128602] [mem 0xc0000000-0xdfffffff] available for PCI devices
[ 0.128604] Booting paravirtualized kernel on KVM
[ 0.128607] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[ 0.128615] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:16 nr_cpu_ids:16 nr_node_ids:1
[ 0.129248] percpu: Embedded 56 pages/cpu s192512 r8192 d28672 u262144
[ 0.129287] setup async PF for cpu 0
[ 0.129294] kvm-stealtime: cpu 0, msr fb8c2e080
[ 0.129301] Built 1 zonelists, mobility grouping on. Total pages: 16224626
[ 0.129302] Policy zone: Normal
[ 0.129304] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.8.0-1038-aws root=PARTUUID=5198cbc0-01 ro console=tty1 console=ttyS0 nvme_core.io_timeout=4294967295 panic=-1
[ 0.135405] Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes, linear)
[ 0.138445] Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes, linear)
[ 0.138515] mem auto-init: stack:off, heap alloc:on, heap free:off
[ 0.267053] Memory: 64693096K/65928732K available (14339K kernel code, 2545K rwdata, 5476K rodata, 2648K init, 4904K bss, 1235636K reserved, 0K cma-reserved)
[ 0.267061] random: get_random_u64 called from kmem_cache_open+0x2d/0x410 with crng_init=0
[ 0.267205] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=16, Nodes=1
[ 0.267222] ftrace: allocating 46691 entries in 183 pages
[ 0.284648] ftrace: allocated 183 pages with 6 groups
[ 0.284772] rcu: Hierarchical RCU implementation.
[ 0.284773] rcu: RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=16.
[ 0.284775] Trampoline variant of Tasks RCU enabled.
[ 0.284775] Rude variant of Tasks RCU enabled.
[ 0.284776] Tracing variant of Tasks RCU enabled.
[ 0.284777] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.284778] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=16
[ 0.287928] NR_IRQS: 524544, nr_irqs: 552, preallocated irqs: 16
[ 0.288408] random: crng done (trusting CPU's manufacturer)
[ 0.433686] Console: colour VGA+ 80x25
[ 0.949504] printk: console [tty1] enabled
[ 1.196291] printk: console [ttyS0] enabled
[ 1.200429] ACPI: Core revision 20200528
[ 1.204793] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 30580167144 ns
[ 1.213129] APIC: Switch to symmetric I/O mode setup
[ 1.217629] Switched APIC routing to physical flat.
[ 1.223344] ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
[ 1.228384] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x1fb441f3908, max_idle_ns: 440795250092 ns
[ 1.237533] Calibrating delay loop (skipped) preset value.. 4398.94 BogoMIPS (lpj=8797896)
[ 1.241533] pid_max: default: 32768 minimum: 301
[ 1.245565] LSM: Security Framework initializing
[ 1.249543] Yama: becoming mindful.
[ 1.253557] AppArmor: AppArmor initialized
[ 1.257659] Mount-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[ 1.261614] Mountpoint-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[ 1.266288] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 512
[ 1.269534] Last level dTLB entries: 4KB 1536, 2MB 1536, 4MB 768, 1GB 0
[ 1.273534] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[ 1.277533] Spectre V2 : Mitigation: Full AMD retpoline
[ 1.281532] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[ 1.285533] Speculative Store Bypass: Vulnerable
[ 1.289807] Freeing SMP alternatives memory: 40K
[ 1.406501] smpboot: CPU0: AMD EPYC 7571 (family: 0x17, model: 0x1, stepping: 0x2)
[ 1.409675] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
[ 1.413537] ... version: 0
[ 1.417532] ... bit width: 48
[ 1.421532] ... generic registers: 6
[ 1.425532] ... value mask: 0000ffffffffffff
[ 1.429532] ... max period: 00007fffffffffff
[ 1.433532] ... fixed-purpose events: 0
[ 1.437532] ... event mask: 000000000000003f
[ 1.441596] rcu: Hierarchical SRCU implementation.
[ 1.446253] smp: Bringing up secondary CPUs ...
[ 1.449663] x86: Booting SMP configuration:
[ 1.453539] .... node #0, CPUs: #1
[ 0.937207] kvm-clock: cpu 1, msr 124a01041, secondary cpu clock
[ 1.455817] setup async PF for cpu 1
[ 1.457530] kvm-stealtime: cpu 1, msr fb8c6e080
[ 1.469534] #2
[ 0.937207] kvm-clock: cpu 2, msr 124a01081, secondary cpu clock
[ 1.471039] setup async PF for cpu 2
[ 1.473530] kvm-stealtime: cpu 2, msr fb8cae080
[ 1.481657] #3
[ 0.937207] kvm-clock: cpu 3, msr 124a010c1, secondary cpu clock
[ 1.485679] setup async PF for cpu 3
[ 1.489530] kvm-stealtime: cpu 3, msr fb8cee080
[ 1.497656] #4
[ 0.937207] kvm-clock: cpu 4, msr 124a01101, secondary cpu clock
[ 1.499437] setup async PF for cpu 4
[ 1.501530] kvm-stealtime: cpu 4, msr fb8d2e080
[ 1.513649] #5
[ 0.937207] kvm-clock: cpu 5, msr 124a01141, secondary cpu clock
[ 1.515060] setup async PF for cpu 5
[ 1.517530] kvm-stealtime: cpu 5, msr fb8d6e080
[ 1.525659] #6
[ 0.937207] kvm-clock: cpu 6, msr 124a01181, secondary cpu clock
[ 1.529602] setup async PF for cpu 6
[ 1.533530] kvm-stealtime: cpu 6, msr fb8dae080
[ 1.541658] #7
[ 0.937207] kvm-clock: cpu 7, msr 124a011c1, secondary cpu clock
[ 1.543028] setup async PF for cpu 7
[ 1.545530] kvm-stealtime: cpu 7, msr fb8dee080
[ 1.553662] #8
[ 0.937207] kvm-clock: cpu 8, msr 124a01201, secondary cpu clock
[ 1.558560] setup async PF for cpu 8
[ 1.561530] kvm-stealtime: cpu 8, msr fb8e2e080
[ 1.569799] #9
[ 0.937207] kvm-clock: cpu 9, msr 124a01241, secondary cpu clock
[ 1.573726] setup async PF for cpu 9
[ 1.577530] kvm-stealtime: cpu 9, msr fb8e6e080
[ 1.585658] #10
[ 0.937207] kvm-clock: cpu 10, msr 124a01281, secondary cpu clock
[ 1.587067] setup async PF for cpu 10
[ 1.589530] kvm-stealtime: cpu 10, msr fb8eae080
[ 1.597671] #11
[ 0.937207] kvm-clock: cpu 11, msr 124a012c1, secondary cpu clock
[ 1.602918] setup async PF for cpu 11
[ 1.605530] kvm-stealtime: cpu 11, msr fb8eee080
[ 1.613655] #12
[ 0.937207] kvm-clock: cpu 12, msr 124a01301, secondary cpu clock
[ 1.617734] setup async PF fo
```
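For reference, the "2/2 checks failed" status combines two separate checks: the system status check (the underlying AWS host) and the instance status check (the guest OS), and both can be read programmatically. A minimal sketch with the AWS SDK for JavaScript v2; the region and instance ID are placeholders:

```
const AWS = require('aws-sdk');
const ec2 = new AWS.EC2({ region: 'us-east-1' }); // placeholder region

async function showStatusChecks(instanceId) {
  // describeInstanceStatus reports the two checks separately:
  // SystemStatus covers the underlying host, InstanceStatus covers the guest OS.
  const resp = await ec2.describeInstanceStatus({
    InstanceIds: [instanceId],
    IncludeAllInstances: true, // also report instances that are not running
  }).promise();
  for (const s of resp.InstanceStatuses) {
    console.log(s.InstanceId, 'system:', s.SystemStatus.Status, 'instance:', s.InstanceStatus.Status);
  }
}

showStatusChecks('i-0123456789abcdef0'); // placeholder instance ID
```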
0 answers · 0 votes · 32 views · asked 3 days ago

IAM policy for a user to access Enhanced Monitoring for RDS

I am trying to create an IAM user with least-privilege permissions to view Enhanced Monitoring for a particular RDS database. I have created a role (Enhanced-Monitoring) and attached the managed policy 'AmazonRDSEnhancedMonitoringRole' to it. This role is passed to the RDS database using the passrole permission. The policy that I am attaching to this IAM user is below:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData",
                "rds:*",
                "cloudwatch:GetMetricData",
                "iam:ListRoles",
                "cloudwatch:GetMetricStatistics",
                "cloudwatch:DeleteAnomalyDetector",
                "cloudwatch:ListMetrics",
                "cloudwatch:DescribeAnomalyDetectors",
                "cloudwatch:ListMetricStreams",
                "cloudwatch:DescribeAlarmsForMetric",
                "cloudwatch:ListDashboards",
                "ec2:*",
                "cloudwatch:PutAnomalyDetector",
                "cloudwatch:GetMetricWidgetImage"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "iam:GetRole",
                "iam:PassRole",
                "cloudwatch:*"
            ],
            "Resource": [
                "arn:aws:cloudwatch:*:accountnumber:insight-rule/*",
                "arn:aws:iam::accountnumber:role/Enhanced-Monitoring",
                "arn:aws:rds:us-east-1:accountnumber:db:dbidentifier"
            ]
        }
    ]
}
```

As you can see, I have given this user almost every permission, but I still get a 'Not Authorized' error for Enhanced Monitoring on the RDS dashboard when signed in as the IAM user, although CloudWatch logs display normally. I am following this guide for Enhanced Monitoring of RDS: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_passrole.html (see example 2 on that page).
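One detail worth noting when reading this question: the Enhanced Monitoring charts in the RDS console are rendered from the RDSOSMetrics CloudWatch Logs log group rather than from ordinary CloudWatch metrics, so a least-privilege viewer typically also needs CloudWatch Logs read access. A hedged sketch of such a statement; the account ID and region are placeholders, and the exact action list may need adjusting:

```
{
    "Sid": "ReadEnhancedMonitoringLogs",
    "Effect": "Allow",
    "Action": [
        "logs:DescribeLogStreams",
        "logs:GetLogEvents"
    ],
    "Resource": "arn:aws:logs:us-east-1:accountnumber:log-group:RDSOSMetrics:*"
}
```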
1 answer · 0 votes · 30 views · asked 11 days ago

Unable to create new OpsItems from EventBridge when using Input Transformer for deduplication and adding category and severity values

Apologies to all for the duplicate post; I created my login under the wrong account when I initially posted this question. I'm able to generate a new OpsItem for any EC2, SecurityGroup, or VPC configuration change using an EventBridge rule with the following event pattern:

```
{
  "source": ["aws.config"],
  "detail-type": ["Config Configuration Item Change"],
  "detail": {
    "messageType": ["ConfigurationItemChangeNotification"],
    "configurationItem": {
      "resourceType": ["AWS::EC2::Instance", "AWS::EC2::SecurityGroup", "AWS::EC2::VPC"]
    }
  }
}
```

The rule and target work great when using Matched event for the input, but I noticed that launching one EC2 instance using the AWS wizard creates at least three OpsItems, one for each resourceType. I'd therefore like to implement a deduplication string to cut the number of OpsItems generated down to one if possible, and I'd also like to attach a category and severity to the new OpsItem. I'm trying to use an Input Transformer as recommended by the AWS documentation, but even the simplest of Input Transformers, when applied, prevents any new OpsItems from being generated. When testing, I also ensured that all previous OpsItems were resolved. Can anyone tell me what might be blocking the creation of new OpsItems when using this Input Transformer configuration? Here's what I have configured now.

Input path:

```
{
  "awsAccountId": "$.detail.configurationItem.awsAccountId",
  "awsRegion": "$.detail.configurationItem.awsRegion",
  "configurationItemCaptureTime": "$.detail.configurationItem.configurationItemCaptureTime",
  "detail-type": "$.detail-type",
  "messageType": "$.detail.messageType",
  "notificationCreationTime": "$.detail.notificationCreationTime",
  "region": "$.region",
  "resourceId": "$.detail.configurationItem.resourceId",
  "resourceType": "$.detail.configurationItem.resourceType",
  "resources": "$.resources",
  "source": "$.source",
  "time": "$.time"
}
```

Input template:

```
{
  "awsAccountId": "<awsAccountId>",
  "awsRegion": "<awsRegion>",
  "configurationItemCaptureTime": "<configurationItemCaptureTime>",
  "resourceId": "<resourceId>",
  "resourceType": "<resourceType>",
  "title": "Template under ConfigDrift-EC2-Dedup4",
  "description": "Configuration Drift Detected.",
  "category": "Security",
  "severity": "3",
  "origination": "EventBridge Rule - ConfigDrift-EC2-Dedup",
  "detail-type": "<detail-type>",
  "source": "<source>",
  "time": "<time>",
  "region": "<region>",
  "resources": "<resources>",
  "messageType": "<messageType>",
  "notificationCreationTime": "<notificationCreationTime>",
  "operationalData": {
    "/aws/dedup": {
      "type": "SearchableString",
      "value": "{\"dedupString\":\"ConfigurationItemChangeNotification\"}"
    }
  }
}
```

Output when using the AWS-supplied sample event called "Config Configuration Item Change":

```
{
  "awsAccountId": "123456789012",
  "awsRegion": "us-east-1",
  "configurationItemCaptureTime": "2022-03-16T01:10:50.837Z",
  "resourceId": "fs-01f0d526165b57f95",
  "resourceType": "AWS::EFS::FileSystem",
  "title": "Template under ConfigDrift-EC2-Dedup4",
  "description": "Configuration Drift Detected.",
  "category": "Security",
  "severity": "3",
  "origination": "EventBridge Rule - ConfigDrift-EC2-Dedup",
  "detail-type": "Config Configuration Item Change",
  "source": "aws.config",
  "time": "2022-03-16T01:10:51Z",
  "region": "us-east-1",
  "resources": "arn:aws:elasticfilesystem:us-east-1:123456789012:file-system/fs-01f0d526165b57f95",
  "messageType": "ConfigurationItemChangeNotification",
  "notificationCreationTime": "2022-03-16T01:10:51.976Z",
  "operationalData": {
    "/aws/dedup": {
      "type": "SearchableString",
      "value": "{"dedupString":"ConfigurationItemChangeNotification"}"
    }
  }
}
```
1 answer · 0 votes · 24 views · asked 2 months ago

AWS SDK: get the number of messages in an SQS dead-letter queue

Hello community, I somehow can't find the right information. I have the following simple task to solve: create a Lambda that checks whether a dead-letter queue has messages and, if it does, reads how many.

Before this, I had an alarm set on an SQS metric. I chose the ApproximateNumberOfMessagesVisible metric, since NumberOfMessagesSent (my first choice) does not work for DLQs. I have read this article: https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-dead-letter-queues.html

> The NumberOfMessagesSent and NumberOfMessagesReceived for a dead-letter queue don't match
>
> If you send a message to a dead-letter queue manually, it is captured by the NumberOfMessagesSent metric. However, if a message is sent to a dead-letter queue as a result of a failed processing attempt, it isn't captured by this metric. Thus, it is possible for the values of **NumberOfMessagesSent** and NumberOfMessagesReceived to be different.

That is nice to know, but it leaves out the information I was missing: which metric should I use if **NumberOfMessagesSent** won't work? Being pragmatic, I triggered an error so that a message was sent to the DLQ as a result of a failed processing attempt. Then I looked at the queue in the AWS console under the Monitoring tab and checked which metric spiked. It was **ApproximateNumberOfMessagesVisible**, which sounded suitable, so I used it.

Now I want to get alerted more often, so I chose to build a Lambda function that checks how many messages are in the DLQ. I use JavaScript/TypeScript, so I found this: https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_GetQueueAttributes.html. The code looked something like this:

```
const params = {
  QueueUrl: url,
  AttributeNames: ['ApproximateNumberOfMessagesVisible']
}
const resp = SQS.getQueueAttributes(params).promise()
```

It was kind of a bummer that the attribute I wanted was not in there — or rather, it was not valid:

> Valid Values: All | Policy | VisibilityTimeout | MaximumMessageSize | MessageRetentionPeriod | ApproximateNumberOfMessages | ApproximateNumberOfMessagesNotVisible | CreatedTimestamp | LastModifiedTimestamp | QueueArn | ApproximateNumberOfMessagesDelayed | DelaySeconds | ReceiveMessageWaitTimeSeconds | RedrivePolicy | FifoQueue | ContentBasedDeduplication | ...

So my next attempt was to use CloudWatch metrics, following this: https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/cloudwatch-examples-getting-metrics.html

```
var params = {
  Dimensions: [
    {
      Name: 'LogGroupName', /* required */
    },
  ],
  MetricName: 'IncomingLogEvents',
  Namespace: 'AWS/Logs'
};
cw.listMetrics(params, function(err, data) {
  if (err) {
    console.log("Error", err);
  } else {
    console.log("Metrics", JSON.stringify(data.Metrics));
  }
});
```

but I could not get this working, since I did not know what to put in Dimensions/Name to make it work. Please note that I have not been working with AWS for long (only 6 months); maybe I am on totally the wrong track. Summarized: I want my Lambda to get the number of messages in a DLQ. I hope someone can help me. Cheers, Aleks
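For what it's worth, the valid-values list quoted above does include ApproximateNumberOfMessages, the count of messages available for retrieval. A hedged sketch of the Lambda's core in the same AWS SDK for JavaScript v2 style as the snippets above; the function name is illustrative:

```
const AWS = require('aws-sdk');
const sqs = new AWS.SQS();

// Sketch: return the approximate number of messages sitting in a (dead-letter) queue.
// 'ApproximateNumberOfMessages' is one of the valid attribute names quoted above.
async function dlqDepth(queueUrl) {
  const resp = await sqs.getQueueAttributes({
    QueueUrl: queueUrl,
    AttributeNames: ['ApproximateNumberOfMessages'],
  }).promise();
  return Number(resp.Attributes.ApproximateNumberOfMessages);
}
```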
1 answer · 0 votes · 48 views · asked 3 months ago

Elastic Beanstalk enhanced health not generating healthd/application.log files

I have enhanced health reporting turned on for my Elastic Beanstalk environment. The environment is:

1. A multicontainer Docker setup running on "Amazon Linux 2".
2. It has an nginx proxy (Configuration > Software shows: Log streaming: disabled / Proxy server: nginx / Rotate logs: disabled / X-Ray daemon: disabled).
3. Enhanced monitoring is on (Configuration > Monitoring shows: CloudWatch Custom Metrics-Environment: / CloudWatch Custom Metrics-Instance: / Health event log streaming: disabled / Ignore HTTP 4xx: enabled / Ignore load balancer 4xx: disabled / System: Enhanced).

However, on the Health page, none of the requests, response, or latency fields are populating, while load and CPU utilization are. It is my understanding that this data is populated from a log file written to `/var/log/nginx/healthd/`, but that directory is empty. This seems like a bug or some sort of misconfiguration. Does anyone know why this might be happening? I've included some relevant info from the machine below.

---

The healthd config file (I've masked the `group_id`, which is a UUID in the actual file):

```
$ cat /etc/healthd/config.yaml
group_id: XXXX
log_to_file: true
endpoint: https://elasticbeanstalk-health.us-east-2.amazonaws.com
appstat_log_path: /var/log/nginx/healthd/application.log
appstat_unit: sec
appstat_timestamp_on: completion
```

The output of the healthd daemon log, showing warnings that previous application.log.YYYY-MM-DD-HH files were not found:

```
$ head /var/log/healthd/daemon.log
# Logfile created on 2022-04-02 21:02:22 +0000 by logger.rb/66358
A, [2022-04-02T21:02:24.123304 #4122] ANY -- : healthd daemon 1.0.6 initialized
W, [2022-04-02T21:02:24.266469 #4122] WARN -- : log file "/var/log/nginx/healthd/application.log.2022-04-02-21" does not exist
W, [2022-04-02T21:02:29.266806 #4122] WARN -- : log file "/var/log/nginx/healthd/application.log.2022-04-02-21" does not exist
W, [2022-04-02T21:02:34.404332 #4122] WARN -- : log file "/var/log/nginx/healthd/application.log.2022-04-02-21" does not exist
W, [2022-04-02T21:02:39.406846 #4122] WARN -- : log file "/var/log/nginx/healthd/application.log.2022-04-02-21" does not exist
W, [2022-04-02T21:02:44.410108 #4122] WARN -- : log file "/var/log/nginx/healthd/application.log.2022-04-02-21" does not exist
W, [2022-04-02T21:02:49.410342 #4122] WARN -- : log file "/var/log/nginx/healthd/application.log.2022-04-02-21" does not exist
W, [2022-04-02T21:02:54.410611 #4122] WARN -- : log file "/var/log/nginx/healthd/application.log.2022-04-02-21" does not exist
W, [2022-04-02T21:02:59.410860 #4122] WARN -- : log file "/var/log/nginx/healthd/application.log.2022-04-02-21" does not exist
```

The /var/log/nginx/ directory with permissions and ownership — is `nginx` supposed to own healthd?

```
$ ls -l /var/log/nginx/
total 12
-rw-r--r-- 1 root root 11493 Apr 4 21:15 access.log
drwxr-xr-x 2 nginx nginx 6 Apr 2 21:01 healthd
drwxr-xr-x 2 root root 6 Apr 2 21:02 rotated
```

The empty /var/log/nginx/healthd/ directory:

```
$ ls /var/log/nginx/healthd/
# this directory is empty
```
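For readers comparing with a healthy environment: the request/latency data on the Health page is produced by nginx itself writing hourly access logs in a special healthd format, which the daemon above then tails. The EB-managed nginx snippet looks roughly like the sketch below — reconstructed from memory, so treat the exact field list as an assumption; the relevant point is that nginx, not healthd, creates the application.log.YYYY-MM-DD-HH files the warnings complain about:

```
# Approximate sketch of the EB-provided healthd logging config, for illustration only.
log_format healthd '$msec"$uri"$status"$request_time"$upstream_response_time"$http_x_forwarded_for';

server {
    # ... rest of the proxy config ...
    # Derive the hourly suffix from the request time.
    if ($time_iso8601 ~ "^(\d{4})-(\d{2})-(\d{2})T(\d{2})") {
        set $year $1;
        set $month $2;
        set $day $3;
        set $hour $4;
    }
    access_log /var/log/nginx/healthd/application.log.$year-$month-$day-$hour healthd;
}
```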
1 answer · 3 votes · 67 views · asked 3 months ago

Proper conversion of AWS Log Insights to Metrics for visualization and monitoring

TL;DR
----
What is the proper way to create a metric so that it generates reliable information about the log insights?

What is desired
------
The current log insights look similar to the following: [![AWS Log insights][1]][1]

However, it becomes easier to analyse these logs using metrics (mostly because you can have multiple sources of data in the same plot and even perform math operations between them).

Solution according to docs
-----
Allegedly, a log can be converted to a metric filter following a guide like [this][2]. However, this approach does not seem to work entirely right (I guess because of the time frames that have to be imposed in the metric plots), providing incorrect information, for example: [![Dashboard][3]][3]

Issue with solution
-----
In the previous image I've created a dashboard containing the metric count (the number 7), corresponding to the sum of events every 5 minutes. I've also added a preview of the log insight corresponding to the information used to create the event. However, as can be seen, the number of logs is 4, but the event count displays 7. Changing the time frame in the metric generates other types of issues (e.g., selecting a very small time frame like 1 sec won't retrieve any data, and a slightly smaller time frame will provide another wrong number: 3, when there are 4 logs, for example).

P.S.
-----
I've also tried converting the log insights to metrics using [this lambda function][4], as suggested by [Danil Smirnov][5], to no avail, as it seems to generate the same issues.

[1]: https://i.stack.imgur.com/0pPdp.png
[2]: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CountingLogEventsExample.html
[3]: https://i.stack.imgur.com/Dy5td.png
[4]: https://serverlessrepo.aws.amazon.com/#!/applications/arn:aws:serverlessrepo:us-east-1:085576722239:applications~logs-insights-to-metric
[5]: https://blog.smirnov.la/cloudwatch-logs-insights-to-metrics-a2d197aac379
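Background on the counting approach from the guide linked above: a metric filter emits a data point per matching log event, and a dashboard widget then aggregates those points over its own period, so the displayed number depends on pairing the Sum statistic with a period that matches the queried window. A hedged sketch of creating such a counting filter with the AWS SDK for JavaScript v2; the log group, filter, and namespace names are placeholders:

```
const AWS = require('aws-sdk');
const logs = new AWS.CloudWatchLogs({ region: 'us-east-1' }); // placeholder region

// Sketch: count every log event containing "ERROR" as a custom metric.
// Each matching event contributes 1; view the metric with statistic=Sum,
// so the widget period determines how events are bucketed.
logs.putMetricFilter({
  logGroupName: '/my/app/log-group',   // placeholder
  filterName: 'CountErrorEvents',      // placeholder
  filterPattern: 'ERROR',
  metricTransformations: [{
    metricName: 'ErrorEventCount',     // placeholder
    metricNamespace: 'Custom/App',     // placeholder
    metricValue: '1',
    defaultValue: 0,                   // report 0 when no events match
  }],
}).promise().then(() => console.log('filter created'));
```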
0 answers · 0 votes · 11 views · asked 4 months ago

SageMaker - All metrics in statistics.json from Model Quality Monitor are "0.0 +/- 0.0", but the confusion matrix is built correctly for multi-class classification

I have scheduled an hourly model quality monitoring job in AWS SageMaker. Both jobs, ground-truth-merge and model-quality-monitoring, complete successfully without any errors, but all the metrics calculated by the job are "0.0 +/- 0.0", while the confusion matrix gets calculated as expected. I have done everything as mentioned in [this notebook for model-quality-monitoring from sagemaker-examples](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker_model_monitor/model_quality/model_quality_churn_sdk.ipynb), with very few changes, namely:

1. I changed the model from the XGBoost churn model to a model trained on my data.
2. My input to the endpoint was CSV, like in the example notebook, but my output was JSON.
3. I changed the problem type from BinaryClassification to MulticlassClassification wherever necessary.

The confusion matrix is built successfully, but all metrics are 0 for some reason. I would like the monitoring job to calculate the multiclass classification metrics on the data properly.

**All logs**

Here's the `statistics.json` file that the model quality monitor saved to S3, with the confusion matrix built but 0s in all the metrics:

```
{
  "version" : 0.0,
  "dataset" : {
    "item_count" : 4432,
    "start_time" : "2022-02-23T03:00:00Z",
    "end_time" : "2022-02-23T04:00:00Z",
    "evaluation_time" : "2022-02-23T04:13:20.193Z"
  },
  "multiclass_classification_metrics" : {
    "confusion_matrix" : {
      "0" : { "0" : 709, "2" : 530, "1" : 247 },
      "2" : { "0" : 718, "2" : 497, "1" : 265 },
      "1" : { "0" : 700, "2" : 509, "1" : 257 }
    },
    "accuracy" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_recall" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_precision" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_f0_5" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_f1" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_f2" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "accuracy_best_constant_classifier" : { "value" : 0.3352888086642599, "standard_deviation" : 0.003252410977346705 },
    "weighted_recall_best_constant_classifier" : { "value" : 0.3352888086642599, "standard_deviation" : 0.003252410977346705 },
    "weighted_precision_best_constant_classifier" : { "value" : 0.1124185852154987, "standard_deviation" : 0.0021869336610830254 },
    "weighted_f0_5_best_constant_classifier" : { "value" : 0.12965524348784485, "standard_deviation" : 0.0024239410000317335 },
    "weighted_f1_best_constant_classifier" : { "value" : 0.16838092925822584, "standard_deviation" : 0.0028615098045768348 },
    "weighted_f2_best_constant_classifier" : { "value" : 0.24009212108475822, "standard_deviation" : 0.003326031863819311 }
  }
}
```

Here's what a couple of lines of captured data look like (prettified for readability; in the actual file each record is a single line):

```
{
  "captureData": {
    "endpointInput": {
      "observedContentType": "text/csv",
      "mode": "INPUT",
      "data": "0,1,628,210,30",
      "encoding": "CSV"
    },
    "endpointOutput": {
      "observedContentType": "application/json",
      "mode": "OUTPUT",
      "data": "{\"label\":\"Transfer\",\"prediction\":2,\"probabilities\":[0.228256680901919,0.0,0.7717433190980809]}\n",
      "encoding": "JSON"
    }
  },
  "eventMetadata": {
    "eventId": "a7cfba60-39ee-4796-bd85-343dcadef024",
    "inferenceId": "5875",
    "inferenceTime": "2022-02-23T04:12:51Z"
  },
  "eventVersion": "0"
}
{
  "captureData": {
    "endpointInput": {
      "observedContentType": "text/csv",
      "mode": "INPUT",
      "data": "0,3,628,286,240",
      "encoding": "CSV"
    },
    "endpointOutput": {
      "observedContentType": "application/json",
      "mode": "OUTPUT",
      "data": "{\"label\":\"Adoption\",\"prediction\":0,\"probabilities\":[0.99,0.005,0.005]}\n",
      "encoding": "JSON"
    }
  },
  "eventMetadata": {
    "eventId": "7391ac1e-6d27-4f84-a9ad-9fbd6130498a",
    "inferenceId": "5876",
    "inferenceTime": "2022-02-23T04:12:51Z"
  },
  "eventVersion": "0"
}
```

Here's what a couple of lines from the ground truth data that I uploaded to S3 look like (prettified for readability; in the actual file each record is a single line):

```
{
  "groundTruthData": { "data": "0", "encoding": "CSV" },
  "eventMetadata": { "eventId": "1" },
  "eventVersion": "0"
}
{
  "groundTruthData": { "data": "1", "encoding": "CSV" },
  "eventMetadata": { "eventId": "2" },
  "eventVersion": "0"
}
```

Here's what a couple of lines from the ground-truth-merged file look like (prettified for readability; in the actual file each record is a single line). This file is created by the ground-truth-merge job, which is one of the two jobs that the model quality monitoring schedule runs:

```
{
  "eventVersion": "0",
  "groundTruthData": { "data": "2", "encoding": "CSV" },
  "captureData": {
    "endpointInput": {
      "data": "1,2,1050,37,1095",
      "encoding": "CSV",
      "mode": "INPUT",
      "observedContentType": "text/csv"
    },
    "endpointOutput": {
      "data": "{\"label\":\"Return_to_owner\",\"prediction\":1,\"probabilities\":[0.14512373737373732,0.6597074314574313,0.1951688311688311]}\n",
      "encoding": "JSON",
      "mode": "OUTPUT",
      "observedContentType": "application/json"
    }
  },
  "eventMetadata": {
    "eventId": "c9e21f63-05f0-4dec-8f95-b8a1fa3483c1",
    "inferenceId": "4432",
    "inferenceTime": "2022-02-23T04:00:00Z"
  }
}
{
  "eventVersion": "0",
  "groundTruthData": { "data": "1", "encoding": "CSV" },
  "captureData": {
    "endpointInput": {
      "data": "0,2,628,5,90",
      "encoding": "CSV",
      "mode": "INPUT",
      "observedContentType": "text/csv"
    },
    "endpointOutput": {
      "data": "{\"label\":\"Adoption\",\"prediction\":0,\"probabilities\":[0.7029623691085284,0.0,0.29703763089147156]}\n",
      "encoding": "JSON",
      "mode": "OUTPUT",
      "observedContentType": "application/json"
    }
  },
  "eventMetadata": {
    "eventId": "5f1afc30-2ffd-42cf-8f4b-df97f1c86cb1",
    "inferenceId": "4433",
    "inferenceTime": "2022-02-23T04:00:01Z"
  }
}
```

Since the confusion matrix is constructed properly, I presume that I fed the data to sagemaker-model-monitor the right way. So why are all the metrics 0.0 while the confusion matrix looks as expected?

EDIT 1: Logs for the job are available [here](https://controlc.com/1e1781d2).
0 answers · 1 vote · 10 views · asked 4 months ago