ZFS Appliance

ZFS (see the ZFS wiki entry) is a file system with several distinct features and advantages.  Storage is managed as pools, and a pool may be hierarchically divided into smaller datasets, each with its own quota.  This gives the effect of hierarchical quotas, which is useful for our application and serves as a model for our own in-house storage management software running above Lustre.  ZFS implements RAID-Z and maintains a tree of embedded checksums; these checksums are always verified on read, so data integrity is very high.  OpenZFS is used.
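
As a concrete illustration of the hierarchical-quota model (a minimal sketch; the pool and dataset names are hypothetical examples, not our actual layout), nested ZFS datasets can each carry their own quota:

  import subprocess

  def zfs_create(dataset: str, quota: str) -> None:
      # Create a ZFS dataset with a quota (requires root and the zfs CLI).
      subprocess.run(["zfs", "create", "-o", f"quota={quota}", dataset], check=True)

  # A parent dataset for an experiment, then smaller datasets beneath it.
  # Each child's quota is enforced within the parent's quota, which is the
  # hierarchical-quota effect described above (names are hypothetical).
  zfs_create("pool0/halld", "100T")
  zfs_create("pool0/halld/raw", "60T")
  zfs_create("pool0/halld/analysis", "40T")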

Lustre File System

The Lustre system (a distributed file system) spans multiple file servers, called Object Storage Servers (OSS), while presenting a single unified namespace as if it were a single server.  File system directory (metadata) information is held in a Metadata Server (MDS), which at JLab is implemented as a dual-headed system with the two servers sharing a SAS disk array.  One of the two heads is active and the other is a standby, providing hot failover if the primary server fails.
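
For illustration (a minimal sketch; the directory and file paths are hypothetical), the standard lfs tool can stripe a directory's files across several object storage targets so that I/O is spread over multiple OSS nodes:

  import subprocess

  # Stripe new files in this (hypothetical) directory across 4 OSTs, spreading
  # reads and writes over several Object Storage Servers.
  subprocess.run(["lfs", "setstripe", "-c", "4", "/lustre/work/mygroup"], check=True)

  # Report how an existing file is laid out across OSTs.
  subprocess.run(["lfs", "getstripe", "/lustre/work/mygroup/run1234.dat"], check=True)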

Xeon Phi Specifications

Details

 

  • 16p (2016 Phi) -- 264 nodes, Xeon Phi 7230, 16 GB high bandwidth memory, 192 GB main memory, Omni-Path fabric (100 Gb/s), 1 TB disk.

Each Knights Landing (KNL) node has 64 cores, each 4-way hyper-threaded (256 virtual cores), running at 1.3 GHz.  The on-package high bandwidth memory has a bandwidth above 450 GB/s, and the main memory has a bandwidth of about 90 GB/s; both are available concurrently.
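
Put together (a simple back-of-the-envelope sketch using only the numbers above):

  # Back-of-the-envelope figures for one 16p (KNL 7230) node, from the specs above.
  cores = 64
  threads_per_core = 4
  virtual_cores = cores * threads_per_core            # 256 hardware threads

  hbm_bw_gbs = 450     # on-package high bandwidth memory, GB/s (approximate)
  ddr_bw_gbs = 90      # main memory, GB/s (approximate)
  total_bw_gbs = hbm_bw_gbs + ddr_bw_gbs              # usable concurrently

  print(f"{virtual_cores} virtual cores, ~{total_bw_gbs} GB/s aggregate memory bandwidth")
  print(f"HBM is roughly {hbm_bw_gbs / ddr_bw_gbs:.0f}x the main-memory bandwidth")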

Infiniband Fabric

The clusters have differently optimized Infiniband network fabrics.  Both use what is referred to as a "fat tree" topology: individual nodes are connected to "leaf" switches, and the leaf switches are connected to "core" switches.  The clusters differ in the amount of bandwidth between leaf switches.
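
The leaf-to-core bandwidth difference is usually summarized as an oversubscription ratio; the sketch below uses hypothetical port counts (not the actual JLab configuration) just to show the arithmetic:

  # Oversubscription of one leaf switch in a fat tree (hypothetical port counts).
  node_ports = 24       # downlinks from the leaf switch to compute nodes
  uplink_ports = 12     # uplinks from the leaf switch to core switches
  link_gbps = 100       # per-link speed

  downlink_bw = node_ports * link_gbps
  uplink_bw = uplink_ports * link_gbps
  print(f"oversubscription = {downlink_bw / uplink_bw:.1f}:1")   # 2.0:1 in this example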

GPU Specifications

Host Details

  • 19g (2019 GeForce RTX 2080) -- 32 nodes, eight RTX 2080 GPUs, 24 Intel(R) Xeon(R) Gold 5118 cores, 196 GB memory, Omni-Path fabric (100 Gb/s), 1 TB disk.
  • 12k (2012 Kepler) -- 46 nodes, 16 Intel 2.0 GHz cores, 128 GB memory, four Kepler K20m GPUs, FDR (56 Gb/s) Infiniband, 1 TB disk.

GPU cards

Tesla cards (w/ ECC memory)

Scientific Computing at Jefferson Lab

(this will eventually be an outline plus most important news, upcoming events, etc.)


Scientific Computing consists of two main systems, one for Experimental Physics and the other for Lattice QCD (theory) computing. Many resources (file servers, offline storage, wide area networking) are shared between the two, with appropriate allocations. 


Wide Area Networking

Jefferson Lab has a 10 Gbps wide area network connection to a MAN (metropolitan area network), with a 10 Gbps connection up to ESnet in Washington, D.C. and a redundant 10 Gbps connection to ESnet in Atlanta.  JLab can reasonably use 5 Gbps of this, and Scientific Computing can reasonably use 4 Gbps.  Thus each of CLAS, GlueX, A+C+misc, and LQCD can use 1 Gbps on average, although each may on occasion find it can sustain 5-6 Gbps.
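
As a rough feel for what these shares mean in practice (a sketch; the 10 TB dataset size is a hypothetical example):

  # Time to move a 10 TB dataset at the shares quoted above.
  dataset_tb = 10
  dataset_bits = dataset_tb * 1e12 * 8                # terabytes -> bits

  for gbps in (1, 4, 5):                              # average share, SciComp share, burst
      hours = dataset_bits / (gbps * 1e9) / 3600
      print(f"{gbps} Gbps: about {hours:.0f} hours")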

Interactive Nodes

All interactive user systems are available from offsite through the JLab user login gateways:  login.jlab.org.
 
JLab resources are being upgraded in 2019 to use the Slurm (SchedMD.com) workload manager.  The JLab Slurm testbed environment runs CentOS 7 and is available via the following systems:
  • islurm1201
  • hpci12k01 -- 2012 dual 8-core Sandy Bridge, 128 GB memory, 1 K20m GPU, CUDA 10, CentOS 7; front end to the JLab Slurm12k cluster
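
As an example of using the testbed (a minimal sketch; the partition name and resource requests are hypothetical, not the actual queue configuration):

  import subprocess

  # Submit a simple batch job to the Slurm testbed from one of the front ends
  # listed above.  Partition and resource values are hypothetical examples.
  subprocess.run([
      "sbatch",
      "--partition=12k",           # hypothetical partition for the Slurm12k cluster
      "--nodes=1",
      "--ntasks=16",
      "--time=01:00:00",
      "--gres=gpu:1",              # request one K20m GPU
      "--wrap", "hostname && nvidia-smi",
  ], check=True)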
 

Tape Library (offline storage)

IBM TS3500 Tape Library

The JLab Mass Storage System (MSS) is an IBM TS3500 tape library with LTO drives, installed in 2008 to replace JLab's original StorageTek silo with Redwood technology.  The TS3500 is a modular system: both the number of frames (which hold tape slots and drives) and the number of tape drives can be expanded.  The lab's JASMine software provides the user interface to the MSS.

Our current configuration consists of

Experimental Physics File System Layout

Experimental Physics users see a file system layout with many parts:

/home: a file system accessible from all CUE nodes; it is the user's normal home directory, held on central file servers.

/group: a file system accessible from all CUE nodes; it is shared space for a group such as an experiment, held on central file servers.
