2033 running against z/OS #75

Closed
CharlesVindum opened this issue Jul 5, 2021 · 13 comments

Comments

@CharlesVindum

CharlesVindum commented Jul 5, 2021

Hi

I'm trying to run the monitor on Linux against a z/OS MQ server, but I get this error:

root@b0633s01:/home/bdujsk/mq-exporter/mq-metric-samples-master/scripts# ./runMonitor.sh mq_prometheus
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
/root/tmp/mq-metric-samples/bin
mq_prometheus
Local address is
IBM MQ metrics exporter for Prometheus monitoring
Build : 20210701-134444
Build Platform: Linux/x86_64
INFO[0000] Connected to queue manager MQT1
FATA[0030] MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]

There are no errors in the MQ log or in RACF, and no messages in the DLQ.

Using MQ Explorer, I can see that the connection is active and that the client is waiting on a temporary queue.

Here are my definitions:
global:
  useObjectStatus: true
  useResetQStats: false
  logLevel: DEBUG
  metaprefix: ""
  pollInterval: 120s
  rediscoverInterval: 1h
  tzOffset: 0h

connection:
  queueManager: MQT1
  ccdtUrl:
  connName:
  channel:
  clientConnection: false
  replyQueue: SYSTEM.DEFAULT.MODEL.QUEUE

objects:
  queues:
  - "KTO.*"
  - "APP.*"
  - "!SYSTEM.*"
  - "!AMQ.*"
  - QM*
  queueSubscriptionSelector:
  - PUT
  - GET
  - GENERAL
  channels:
  - SYSTEM.*
  - TO.*
  - MQS1.*
  topics:
  subscriptions:
  showInactiveChannels: false

prometheus:
  port: 9157
  metricsPath: "/metrics"
  namespace: ibmmq

This is an example of running one of the containers containing

. ./common.sh

if [ -z "$1" ]
then
  echo "Must provide a collector name such as 'mq_prometheus'"
  exit 1
fi

mon=$1
monbase=`echo $mon | sed "s/mq_//g"`
TAG=mq-metric-$monbase

echo $OUTDIR
echo $mon
echo "Local address is $addr"

port="1414"
addr="sysv.bankdata.lan"

docker run --rm -p 9290:9157 \
  -e MQSERVER="GIJ_CLI/TCP/$addr($port)" \
  -v $OUTDIR/$mon.yaml:/opt/config/$mon.yaml \
  -it $TAG:$VER

/Charles Vindum, Bankdata

@ibmmqmet
Collaborator

ibmmqmet commented Jul 6, 2021

The log output you provide doesn't correspond to what you should get with loglevel=debug. At minimum I'd expect to see the active configuration printed out. So I wonder if you're actually picking up the correct config file.

@CharlesVindum
Author

Hi Mark
I'm quite sure I'm running the correct config file: if I change the queue manager to XXXX instead of MQT1, I get a "queue manager not known" error (2058). I've tried loglevel: debug, Debug and DEBUG, with the same result. Could the reason be that we're running in a container and the logging goes elsewhere?
/Charles

@ibmmqmet
Collaborator

ibmmqmet commented Jul 8, 2021

If you've been using the sample Dockerfiles unchanged, then you might be picking up the env var set in Dockerfile.run that also sets a loglevel - env vars have priority over config files or command line.
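
The precedence described here can be sketched in a few lines of Go. This is only an illustration, not the exporter's actual code: the function name `resolveSetting` and the variable name `EXAMPLE_LOGLEVEL` are hypothetical, and the relative order of command line and config file below the environment is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// resolveSetting returns the effective value of one setting. Sources are
// checked from highest to lowest priority: environment variable, command-line
// flag, config file. An empty string means "not set in that source".
// Placing the flag above the file is an assumption made for this sketch.
func resolveSetting(envValue, flagValue, fileValue, fallback string) string {
	for _, v := range []string{envValue, flagValue, fileValue} {
		if v != "" {
			return v
		}
	}
	return fallback
}

func main() {
	// EXAMPLE_LOGLEVEL is a hypothetical variable name used only here.
	env := os.Getenv("EXAMPLE_LOGLEVEL")
	// The config file asks for DEBUG, but an environment value would win.
	fmt.Println(resolveSetting(env, "", "DEBUG", "INFO"))
}
```

With this ordering, a loglevel baked into a container image as an environment variable silently overrides a `logLevel: DEBUG` line in the YAML file, which matches the symptom reported above.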

@ubijsk

ubijsk commented Jul 9, 2021

Hi Mark
Right, Dockerfile.run sets the default loglevel. I've changed that now and more log output appears.

I hope you can see what's wrong:

I'm exposing the port as 9290 (docker run --rm -p 9290:9157).

I expected to get metrics, but none show up.

When the Docker container is running:

root@b0633s01:~# curl http://localhost:9290/metrics
curl: (56) Recv failure: Connection reset by peer

When no Docker container is running:

root@b0633s01:~# curl http://localhost:9290/metrics
curl: (7) Failed connect to localhost:9290; Connection refused

Debug logging:

root@b0633s01:/home/bdujsk/mq-exporter/mq-metric-samples-master/scripts# ./runMonitor.sh mq_prometheus
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
/root/tmp/mq-metric-samples/bin
mq_prometheus
Local address is
IBM MQ metrics exporter for Prometheus monitoring
Build : 20210709-112359
Build Platform: Linux/x86_64

DEBU[0000] VerifyConfig Config: {cf:{ConfigFile:/opt/config/mq_prometheus.yaml QMgrName:MQT1 ReplyQ:SYSTEM.DEFAULT.MODEL.QUEUE MetaPrefix: TZOffsetString:0h Locale: MonitoredQueues:KTO.,APP.,SYSTEM.,!AMQ.,QM* MonitoredQueuesFile: MonitoredChannels:SYSTEM.,TO.,KTO.* MonitoredChannelsFile: MonitoredTopics: MonitoredTopicsFile: MonitoredSubscriptions: MonitoredSubscriptionsFile: QueueSubscriptionSelector:PUT,GET,GENERAL LogLevel:DEBUG pollInterval:30s PollIntervalDuration:0 rediscoverInterval:1h RediscoverDuration:0 CC:{ClientMode:false UserId: Password: TZOffsetSecs:0 UsePublications:false UseStatus:true UseResetQStats:false ShowInactiveChannels:true CcdtUrl: ConnName: Channel:}} httpListenPort:9157 httpListenHost: httpMetricPath:/metrics namespace:ibmmq httpsCertFile: httpsKeyFile:}
DEBU[0000] Monitored topics are ''
DEBU[0000] Connecting to queue manager MQT1
INFO[0000] Connected to queue manager MQT1
FATA[0030] MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]

@ibmmqmet
Collaborator

The only things I can think of are unlikely, such as:

  • the z/OS command server isn't reading from the SYSTEM.COMMAND.INPUT queue
  • it's extremely slow to respond

Setting a loglevel of TRACE might give a better idea of which command isn't getting a response.

@ibmmqmet
Collaborator

I was able to reproduce this, with the debug output showing the odd response type, and I've found what's going on. When the code loops round the MQGET with a larger buffer to retrieve a bigger response message, the MQMD has already been modified by the previous call, so data conversion is not performed on the second pass.

The temporary fix of increasing that buffer will work; I've got a proper fix that I'll push in the next update.
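
The effect described above can be imitated with a small self-contained Go simulation. It does not use the mq-golang API: `msgDesc`, `fakeGet` and `getWithRetry` are hypothetical stand-ins, the CCSID values are illustrative, and the sketch assumes that a failed get leaves the descriptor holding the message's own CCSID, so a later call with the same descriptor no longer requests a conversion.

```go
package main

import (
	"errors"
	"fmt"
)

// msgDesc is a hypothetical stand-in for the MQMD; it is not the mq-golang type.
type msgDesc struct {
	CCSID int // input: CCSID the caller wants; output: CCSID of the returned data
}

const (
	wantLocal   = 0    // stand-in for "convert to the caller's own CCSID"
	localCCSID  = 1208 // illustrative local CCSID
	remoteCCSID = 500  // illustrative EBCDIC CCSID on the remote side
)

var errTruncated = errors.New("buffer too small")

// fakeGet imitates the assumed behaviour: if the buffer is too small the call
// fails and the descriptor is overwritten with the message's own CCSID.
// On success, conversion happens only if the requested CCSID differs.
func fakeGet(md *msgDesc, buf []byte, msgLen int) (bool, error) {
	if len(buf) < msgLen {
		md.CCSID = remoteCCSID
		return false, errTruncated
	}
	converted := md.CCSID != remoteCCSID
	if converted {
		md.CCSID = localCCSID
	}
	return converted, nil
}

// getWithRetry doubles the buffer until the message fits. When resetMD is
// true the descriptor is reinitialised before every attempt.
func getWithRetry(msgLen int, resetMD bool) bool {
	md := &msgDesc{CCSID: wantLocal}
	size := 1024
	for {
		if resetMD {
			*md = msgDesc{CCSID: wantLocal}
		}
		converted, err := fakeGet(md, make([]byte, size), msgLen)
		if err == nil {
			return converted
		}
		size *= 2
	}
}

func main() {
	fmt.Println("descriptor reused, converted:", getWithRetry(5000, false))
	fmt.Println("descriptor reset, converted: ", getWithRetry(5000, true))
}
```

In this model a message that fits in the first buffer is always converted, which would explain why the problem only shows up when there are many queues and the response grows past the initial buffer size.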

@CharlesVindum
Author

CharlesVindum commented Sep 17, 2021 via email

@ibmmqmet
Collaborator

I think I've got a fix for that too

@CharlesVindum
Author

CharlesVindum commented Sep 17, 2021 via email

ibmmqmet added a commit that referenced this issue Sep 20, 2021
* Add script to build images using buildah + Red Hat UBI base
* Temporarily override vendored mq-golang/mqmetric code to deliver fixes
  * Deal with buffer expansion when there are lots of queues to query AND remote system is different CCSID (#75)
  * Ensure labels - in particular object DESCR values - are valid UTF8
@ibmmqmet
Collaborator

hopefully fixed now

@CharlesVindum
Author

CharlesVindum commented Sep 21, 2021 via email

@hamid-vakil

I have this problem too, and I get these TRACE and DEBUG level messages:

level=debug msg="VerifyConfig Config: {cf:{ConfigFile:/opt/config/mq_prometheus.yaml QMgrName:ACKQM ReplyQ:SYSTEM.DEFAULT.MODEL.QUEUE ReplyQ2: MetaPrefix: TZOffsetString:0h Locale: MonitoredQueues:APP.,!SYSTEM.,!AMQ.,QM MonitoredQueuesFile: MonitoredChannels:SYSTEM.,TO. MonitoredChannelsFile: MonitoredTopics: MonitoredTopicsFile: MonitoredSubscriptions: MonitoredSubscriptionsFile: QueueSubscriptionSelector:PUT,GET,GENERAL LogLevel:DEBUG pollInterval:120s PollIntervalDuration:0 rediscoverInterval:1h RediscoverDuration:0 CC:{ClientMode:false UserId:mqm Password:mqm TZOffsetSecs:0 SingleConnect:false UsePublications:true UseStatus:true UseResetQStats:false ShowInactiveChannels:false CcdtUrl: ConnName:10.15.15.10(1415) Channel:SYSTEM.ADMIN.SVRCONN}} httpListenPort: httpListenHost: httpMetricPath: namespace: httpsCertFile: httpsKeyFile: keepRunning:false reconnectIntervalDuration:0 reconnectInterval:5s}"

level=debug msg="In main loop: qMgrConnected=false"
level=info msg="Trying to connect as client using ConnName: 10.15.15.10(1415), Channel: SYSTEM.ADMIN.SVRCONN"
level=debug msg="Connecting to queue manager ACKQM"
level=debug msg="HTTP server - waiting until MQ connection ready"
level=info msg="Connected to queue manager ACKQM"
level=error msg="Connection to ACKQM has failed. MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
level=info msg=Done.

ibm-mq-exporter | IBM MQ metrics exporter for Prometheus monitoring
ibm-mq-exporter | Build : 20220515-192928
ibm-mq-exporter | Commit Level : 19f1a22
ibm-mq-exporter | Build Platform: Linux/unknown

ADMIN.COMMAND.QUEUE} replyQObj:{hObj:103 qMgr:0xc0002c2300 Name:AMQ.62A80FD8261ADBE3} qMgrObject:{hObj:101 qMgr:0xc0002c2300 Name:} replyQBaseName:SYSTEM.DEFAULT.MODEL.QUEUE replyQ2BaseName: statusReplyQObj:{hObj:104 qMgr:0xc0002c2300 Name:AMQ.62A80FD8261ADBE4} statusReplyBuf:[] platform:3 commandLevel:800 maxHandles:256 resolvedQMgrName:ACKQM qmgrConnected:true queuesOpened:true subsOpened:false}"
level=trace msg="< [InitConnection] rp: 0 Error: "
level=info msg="Connected to queue manager ACKQM"
level=trace msg="> [DiscoverAndSubscribe]"
level=trace msg="> [discoverAndSubscribe]"
level=trace msg="> [discoverStats]"
level=trace msg="> [discoverClasses]"
level=trace msg="> [subscribeWithOptions]"
level=trace msg="< [subscribeWithOptions] rp: 0 Error: nil"
level=trace msg="> [getMessageWithHObj]"
level=trace msg="< [getMessageWithHObj] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
level=trace msg="> [parsePCFResponse]"
level=trace msg="< [parsePCFResponse] rp: 1"
level=trace msg="< [discoverClasses] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
level=trace msg="< [discoverStats] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
level=trace msg="< [discoverAndSubscribe] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
level=trace msg="< [DiscoverAndSubscribe] rp: 0 Error: MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
level=trace msg="> [RediscoverAttributes]"
level=trace msg="> [inquireChannelAttributes]"
level=trace msg="> [statusClearReplyQ]"
level=trace msg="< [statusClearReplyQ] rp: 0"
level=trace msg="> [statusSetCommandHeaders]"
level=trace msg="< [statusSetCommandHeaders] rp: 0"
level=trace msg="> [statusGetReply]"
level=trace msg="< [statusGetReply] rp: 0 Error: nil"
level=trace msg="> [parseChannelAttrData]"
level=trace msg="< [parseChannelAttrData] rp: 0"
level=trace msg="> [statusGetReply]"
...
...
...
level=trace msg="< [statusGetReply] rp: 1 CFH: &{Type:2 StrucLength:36 Version:1 Command:25 MsgSeqNumber:1 Control:0 CompCode:2 Reason:2085 ParameterCount:0} Error: "
level=trace msg="> [statusGetReply]"
level=trace msg="< [statusGetReply] rp: 1 CFH: &{Type:2 StrucLength:36 Version:1 Command:25 MsgSeqNumber:2 Control:1 CompCode:2 Reason:3008 ParameterCount:0} Error: "
level=trace msg="< [inquireChannelAttributes] rp: 0"
level=trace msg="< [RediscoverAttributes] rp: 0 Error: nil"
level=error msg="Connection to ACKQM has failed. MQGET: MQCC = MQCC_FAILED [2] MQRC = MQRC_NO_MSG_AVAILABLE [2033]"
level=info msg=Done.
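
For readers following the numeric codes in this thread, a minimal Go lookup of the reason codes that appear in it may help. The map covers only those four codes, and `describeReason` is a hypothetical helper written for this note, not part of the exporter.

```go
package main

import "fmt"

// reasonNames lists only the reason codes that appear in this thread.
var reasonNames = map[int]string{
	2033: "MQRC_NO_MSG_AVAILABLE",
	2058: "MQRC_Q_MGR_NAME_ERROR",
	2085: "MQRC_UNKNOWN_OBJECT_NAME",
	3008: "MQRCCF_COMMAND_FAILED",
}

// describeReason returns the symbolic name for a code, or a placeholder
// when the code is not in the small table above.
func describeReason(code int) string {
	if name, ok := reasonNames[code]; ok {
		return name
	}
	return fmt.Sprintf("unknown reason %d", code)
}

func main() {
	for _, code := range []int{2033, 2058, 2085, 3008} {
		fmt.Printf("%d = %s\n", code, describeReason(code))
	}
}
```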

@ibmmqmet
Collaborator

You're trying to monitor a very old and way-out-of-support V8 queue manager (commandLevel shows that). It's nothing to do with the original problem reported here. Setting usePublications=false should allow it to continue in a limited fashion, but better would be to upgrade to a supported level of MQ.
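
As a rough sketch of the reasoning here, the check below decides whether resource publications can be expected from the `platform` and `commandLevel` values shown in the trace. It rests on two assumptions: that publications need command level 900 or later, and that they are not available on z/OS (platform value 1). The function name `canUsePublications` is hypothetical and this is not how the exporter itself decides.

```go
package main

import "fmt"

const (
	platformZOS        = 1   // assumed platform value for z/OS
	minPublicationsCmd = 900 // assumed first command level with resource publications
)

// canUsePublications reports whether, under the assumptions above, a queue
// manager is expected to publish resource metrics. When it returns false,
// the collector would need usePublications set to false.
func canUsePublications(platform, commandLevel int) bool {
	return platform != platformZOS && commandLevel >= minPublicationsCmd
}

func main() {
	// Values taken from the trace in this thread: platform:3 commandLevel:800.
	fmt.Println(canUsePublications(3, 800))
}
```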
