SCore-D Administration Manual (8)


SCore-D Startup
dqt [-tq time_slice] [-loghost syslog_host[:portno]] [-dqtmhost dqtm_host[:portno]] [-server server_peno]
tss [-tq time_slice] [-loghost syslog_host[:portno]] [-server server_peno]
System Management
SCore-D Device Management
sc_device node_num device_name device_path arg1 arg2 ... argn
sc_device [-r] device_name
Monitoring Daemons
sc_syslog [-t] [logfile]


SCore-D provides a multi-user environment for running multiple parallel programs. The name SCore-D is generic: there can be several SCore-D programs, differing in scheduling policy and so on. One SCore-D program, dqt, schedules user parallel processes with a Distributed Queue Tree, in which user parallel processes are multiplexed in both the time and space domains. Another SCore-D program, tss, schedules user parallel processes in a simple time-sharing (TSS) fashion.

Several peripheral programs support SCore-D. The dqt_monitor program is a server for monitoring DQT load information. The sc_syslog program is also a server, for monitoring SCore-D load information.

To start up the SCore-D environment on a cluster, follow this sequence:

  1. Invoke dqt_monitor and/or sc_syslog if needed. dqt_monitor is a sequential program (not an SCore-S/D program) that distributes dqt load information to its clients. sc_syslog is also a sequential program, which distributes SCore-D system information. These programs can run on a host outside the cluster.
  2. Invoke dqt or tss on the cluster. Either program runs as an SCore-S program. Refer to the SCore User Manual for their SCore options. If dqt_monitor (dqt only) and/or sc_syslog has been invoked, the corresponding options described later must be given.

sc_device, sc_shutdown and sc_sync are SCore-D applications. sc_device adds a new SCore-D device or removes an existing one. sc_shutdown shuts down SCore-D. sc_sync is an alternative to the UNIX update daemon program: it synchronizes (flushes) the i-node cache periodically, at 30-second intervals.

SCore-D Startup

When dqt or tss is invoked, the SCore-D server host, that is, the node that accepts users' job submissions, is the host with the largest node number. dqt and tss are SCore-S programs, so they take the normal SCore options described in the SCore User Manual. In addition, dqt and tss accept the following options:

-tq time_slice
Specify the time-sharing interval (time slice) in milliseconds.
-server server_peno
Specify the server host processor by processor number.
-loghost syslog_host[:portno]
Specify the host where sc_syslog is running, and optionally its port number, in hostname[:portno] form.
-dqtmhost dqtm_host[:portno] (dqt only)
Specify the host where dqt_monitor is running, and optionally its port number, in hostname[:portno] form. This option is only effective with dqt.
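As a rough illustration of how these options combine, the following sh sketch assembles a dqt argument list. The function name build_dqt_args, the host names, the port number and the time-slice value are placeholders invented for this example. The sketch only prints the command line; how dqt is actually launched as an SCore-S program is covered by the SCore User Manual and is not shown.

```shell
#!/bin/sh
# Hypothetical helper: prints a dqt command line built only from the
# options documented above. It does not run dqt.
build_dqt_args() {
    time_slice=$1   # -tq value, in msec
    loghost=$2      # hostname[:portno] of sc_syslog, or empty
    dqtmhost=$3     # hostname[:portno] of dqt_monitor, or empty

    cmd="dqt -tq $time_slice"
    if [ -n "$loghost" ]; then
        cmd="$cmd -loghost $loghost"
    fi
    if [ -n "$dqtmhost" ]; then
        cmd="$cmd -dqtmhost $dqtmhost"
    fi
    echo "$cmd"
}

# Example with placeholder hosts (assumed names, not defaults).
build_dqt_args 100 loghost.example:9000 monhost.example
```

For tss, the same pattern would apply without the -dqtmhost part, since that option is only effective with dqt.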
Termination Detection
SCore-D monitors the activities of user parallel processes. When SCore-D detects an explicit termination due to a deadlock, SCore-D kills the terminated user parallel process.
User-ID and Current Working Directory
SCore-D calls fork() and exec() to start user programs, sets the user-ID, and changes the current working directory. When users are using NFS (Network File System) or AFS (Andrew File System), some shells do not set the PWD environment variable properly. In this case SCore-D fails to spawn user processes. To avoid this, use tcsh or another shell that sets the PWD environment variable properly. Further, for protection, the user-ID must be greater than or equal to 100. SCore-D also fails to spawn user processes if it does not have root permission.
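The two user-side conditions above can be checked before submitting a job. The sketch below is a minimal pre-flight check, not part of SCore-D: the function name check_scored_user_env is hypothetical, and treating "PWD is set properly" as "PWD is non-empty and names an existing directory" is an assumption made for this example.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not an SCore-D tool.
# Arguments: numeric user-ID, value of the PWD environment variable.
check_scored_user_env() {
    uid=$1
    pwd_value=$2
    if [ "$uid" -lt 100 ]; then
        echo "user-ID $uid is below 100"
        return 1
    fi
    if [ -z "$pwd_value" ]; then
        echo "PWD is not set"
        return 1
    fi
    # Assumption: a properly set PWD names an existing directory.
    if [ ! -d "$pwd_value" ]; then
        echo "PWD does not name a directory: $pwd_value"
        return 1
    fi
    echo "ok"
}

# Typical call from the user's login shell:
#   check_scored_user_env "$(id -u)" "$PWD"
```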
Temporary Directory
When a user submits a job to SCore-D, SCore-D first copies the user program binary file to a local temporary directory, /var/scored/. When SCore-D starts up, each SCore-D process checks whether the directory exists and, if it does not, tries to create it. However, the /var directory is usually writable only by root, so the creation fails and SCore-D fails its initialization. To avoid this, an administrator must create the directory on each cluster host before the very first startup.
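A minimal sketch of that one-time step follows. The function name ensure_scored_dir is hypothetical; the directory name /var/scored comes from the text above. The function would have to be run as root on every cluster host by whatever remote-execution method the site uses, which is not shown, and no particular ownership or permission mode is set because this manual does not specify one.

```shell
#!/bin/sh
# Hypothetical helper: create the SCore-D temporary directory if missing.
# The directory defaults to /var/scored; pass another path to try it out.
ensure_scored_dir() {
    dir=${1:-/var/scored}
    if [ -d "$dir" ]; then
        echo "exists: $dir"
        return 0
    fi
    if mkdir -p "$dir"; then
        echo "created: $dir"
    else
        echo "cannot create $dir" >&2
        return 1
    fi
}

# On a cluster host, as root:
#   ensure_scored_dir
```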


SCore User Manual

System Management

sc_shutdown shuts down SCore-D.

Error checking is weak.

SCore-D User Manual

SCore-D Device Management

Without the -r option, sc_device adds an SCore-D device: it invokes the corresponding device server process on the node specified by node_num, together with the device processes that are derived (forked) from the device server process. device_name gives the name by which user programs refer to the device. The device program invoked by SCore-D must be specified by device_path, which SHOULD be an absolute path name. The specified program is assumed to be written according to the SCore-D Device Programming Manual. Arguments for the device server invocation may follow device_path. With the -r option, sc_device tries to remove the device specified by device_name.

node_num
Specify the node number where the device server and device processes are invoked.
[-r] device_name
Specify the registered name of the device, by which user programs refer to it. With the -r keyword, sc_device tries to remove the specified device.
device_path
Specify the pathname of the device program.
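To make the argument order concrete, the sketch below builds an sc_device command line for adding a device and rejects a relative device_path, following the recommendation above. The function name sc_device_add_cmd, the device name and the path are placeholders invented for this example. The sketch only prints the command; how sc_device is submitted as an SCore-D application is not shown.

```shell
#!/bin/sh
# Hypothetical helper: prints an sc_device command line for adding a device.
# Arguments: node_num device_name device_path [arg1 ... argn]
sc_device_add_cmd() {
    if [ $# -lt 3 ]; then
        echo "usage: sc_device_add_cmd node_num device_name device_path [args]" >&2
        return 2
    fi
    node_num=$1
    device_name=$2
    device_path=$3
    shift 3
    # device_path SHOULD be absolute, so refuse anything else.
    case $device_path in
        /*) ;;
        *)
            echo "device_path should be absolute: $device_path" >&2
            return 1
            ;;
    esac
    cmd="sc_device $node_num $device_name $device_path"
    if [ $# -gt 0 ]; then
        cmd="$cmd $*"
    fi
    echo "$cmd"
}

# Example with placeholder values (not a real device).
sc_device_add_cmd 0 mydev /opt/dev/mydev -v
```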

Error checking is weak.

SCore-D Device Programming Manual, SCore-D User Manual

SCore-D System Logging and Monitoring Daemons

sc_syslog logs users' logins and logouts. The log file can be specified as a command argument; the default log filename is scored.mesg. When the SIGHUP signal is sent to this daemon process, the program tries to reopen the log file. If the -t option is specified, the program simply writes the logging information to its standard output.
dqt_monitor is a daemon server for collecting scheduling information of a DQT (Distributed Queue Tree).
The sc_syslog program takes the following options:

-t
Write the log to standard output instead of a journal file. For debugging purposes.
logfile
Specify the journal filename. The default is "scored.mesg".
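Because sc_syslog tries to reopen its log file on SIGHUP, the log can in principle be rotated by renaming the file and then signalling the daemon. The sketch below shows that pattern. The function name rotate_scored_log and the ".old" suffix are hypothetical, and it is an assumption that reopening after a rename starts a fresh file under the original name.

```shell
#!/bin/sh
# Hypothetical helper: rename the journal file, then ask the daemon to
# reopen it. Arguments: journal file path, process ID of sc_syslog.
rotate_scored_log() {
    logfile=$1
    pid=$2
    mv "$logfile" "$logfile.old" || return 1
    # Assumption: on SIGHUP the daemon reopens the original filename.
    kill -s HUP "$pid"
}

# Example, with the default journal name and a PID obtained elsewhere:
#   rotate_scored_log scored.mesg "$sc_syslog_pid"
```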


SCore-D Monitor Applet, SCore-D DQT Monitor Applet

There is a Java program for watching SCore-D user process status in real time. If this applet client is to be used, the sc_syslog program should run on a host where httpd (the HTTP daemon process) is running. Otherwise, the message board client applet cannot establish a socket connection.

Real World Computing Partnership
Parallel Distributed System Software Tsukuba Laboratory

$Id: scored-adm.html,v 1.8 1998/06/12 03:56:40 hori Exp $